# KernRift Language Reference

**KernRift** is a bare-metal systems programming language and compiler created by Pantelis Christou. It compiles itself. It runs on Linux, Windows, macOS, and Android across x86_64 and ARM64 without any C toolchain, runtime, or libc.

This document describes what the language actually is. Every feature listed here is implemented in the compiler you just installed — if you hit something that doesn't work, it's a bug, not a typo in the docs.

---

## Table of Contents

1. [File structure and comments](#1-file-structure-and-comments)
2. [Types](#2-types)
3. [Variables and assignment](#3-variables-and-assignment)
4. [Operators](#4-operators)
5. [Control flow](#5-control-flow)
6. [Functions](#6-functions)
7. [Structs, methods, and enums](#7-structs-methods-and-enums)
8. [Arrays](#8-arrays)
9. [Slice parameters](#9-slice-parameters)
10. [Static variables and constants](#10-static-variables-and-constants)
11. [Pointer operations](#11-pointer-operations)
12. [Volatile and atomic](#12-volatile-and-atomic)
13. [Device blocks (MMIO)](#13-device-blocks-mmio)
14. [Inline assembly](#14-inline-assembly)
15. [Floating-point types](#15-floating-point-types)
16. [Allocators and memory management](#16-allocators-and-memory-management)
17. [Imports](#17-imports)
18. [Built-in functions](#18-built-in-functions)
19. [Annotations](#19-annotations)
20. [Compiler CLI](#20-compiler-cli)
21. [Living compiler](#21-living-compiler)
22. [Language profiles (#lang)](#22-language-profiles-lang)
23. [Freestanding mode](#23-freestanding-mode)
24. [Extern functions](#24-extern-functions)
25. [Binary formats](#25-binary-formats)

---

## 1. File structure and comments

KernRift source files use the `.kr` extension. One file is one module. A program starts execution at `fn main()` (unless you pass `--freestanding`).

```kr
// Line comment

/* Block comment.
   Can span multiple lines.
*/

fn main() {
    println("Hello, KernRift!")
    exit(0)
}
```

Statements do not require trailing semicolons. Semicolons are accepted and ignored — useful when you want to write multiple statements on one line.

---

## 2. Types

### Scalar types

| Type | Width | Alias | Notes |
|-----------|-------|-------|-------------------------------|
| `uint8` | 1 B | `u8`, `byte` | Unsigned byte |
| `uint16` | 2 B | `u16` | Unsigned 16-bit |
| `uint32` | 4 B | `u32` | Unsigned 32-bit |
| `uint64` | 8 B | `u64`, `addr` | Unsigned 64-bit, pointer-sized |
| `int8` | 1 B | `i8` | Signed byte |
| `int16` | 2 B | `i16` | Signed 16-bit |
| `int32` | 4 B | `i32` | Signed 32-bit |
| `int64` | 8 B | `i64` | Signed 64-bit |
| `f16` | 2 B | | IEEE 754 half-precision (storage only on ARM64) |
| `f32` | 4 B | `float` | IEEE 754 single-precision — full arithmetic, literals `1.5f` |
| `f64` | 8 B | `double` | IEEE 754 double-precision — full arithmetic, default for float literals (`1.5`, `2e10`, `3.14`) |
| `bool` | 1 B | | `true` / `false` (strict, since v2.8.3) |
| `char` | 1 B | | Single byte holding a character literal (`'A'`, `'\n'`, …); strict since v2.8.3 |

All integer values are stored as 64-bit words in variable slots. The specific width matters for pointer load/store and for struct field layout. The short aliases (`u8`, `u64`, `i32`, …) are exact synonyms for the long form.

Floating-point types keep their declared width (f32 in 32-bit slots, f64 in 64-bit slots) and are tracked through the IR with a per-vreg "fkind" tag so the emitter picks the right load/store/convert instructions. Full floating-point details (operators, conversions, the `std/math_float.kr` library) live in §15.

### `bool` (strict since v2.8.3)

```kr
bool ok = true      // ok
bool done = false   // ok
bool b = 1          // compile error — int literal not assignable to bool
```

Inside `if`/`while`, the compiler still accepts any integer (`0` false, non-zero true), so `if str_eq(a, b) { ... }` works even though `str_eq` returns `u64`.
The type strictness only bites on variable declarations and struct fields — it stops `uint64 flag = true` being silently coerced.

### `char` (strict since v2.8.3)

```kr
char c = 'A'          // stored as byte 65
char nl = '\n'        // stored as byte 10
if c == 'A' { ... }   // mixing char with its int value works
char bad = 97         // compile error — int literal not assignable
```

### Literals

- Decimal: `42`, `1000000`
- Hex: `0x1000`, `0xDEADBEEF`
- Float: `1.5`, `-3.14`, `2e10`, `1.5f` (f32 suffix)
- Bool: `true`, `false` (strict `bool` type)
- String: `"hello"` with `\n`, `\t`, `\\`, `\"`, `\0` escapes
- Character: `'A'`, `'\n'`, `'\t'`, `'\r'`, `'\0'`, `'\\'`, `'\''` — evaluates to the byte value of the character (e.g. `'A'` is 65, `'\n'` is 10). Use them directly in comparisons and arithmetic: `if c == 'a' { ... }`.
- f-string: `f"pi = {3.14}, answer = {x}"` — `{expr}` interpolates with type-directed formatting (integers, floats, bools, chars, `@string` slots), `{{`/`}}` escape.

---

## 3. Variables and assignment

```kr
TYPE name = initializer
TYPE name               // uninitialized — garbage contents
name = new_value
```

The type precedes the name (C-style, not Rust-style).

```kr
u32 status = 0
u64 base = 0x3F000000
u8 byte = 0xFF
```

### `let` (type inference)

`let name = expr` declares a local whose type is inferred from the initializer, so you don't repeat the type:

```kr
let count = 0             // u64
let total = a + b         // type of a
let ok = x < limit        // bool
let value = lookup(key)   // the function's return type
let pi = 3.14159          // f64
```

An initializer is required — there is nothing to infer from otherwise, so `let x` is a compile error. Inference covers integer/float/bool literals, identifiers, calls, arithmetic, comparisons, ternaries and match expressions. For a struct value, declare it with an explicit type (`Point p = ...`) rather than `let` for now. `let` is a local-only convenience; parameters, fields and statics still spell out their types.
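
A minimal sketch of how explicit declarations and `let` can sit side by side. The `clamp` helper and its values are hypothetical, the snippet uses only constructs described in this reference, and it has not been compiled here:

```kr
// Hypothetical helper: caps v at limit.
fn clamp(u64 v, u64 limit) -> u64 {
    let over = v > limit             // inferred as bool from the comparison
    if over { return limit }
    return v
}

fn main() {
    u64 limit = 100                  // explicit, C-style declaration
    let raw = 250                    // inferred as u64 from the integer literal
    let capped = clamp(raw, limit)   // inferred from clamp's return type
    println(capped)                  // 100
    exit(0)
}
```

Spelling out the type on `limit` while inferring the rest is only a stylistic choice; parameters still need their types either way.
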
### Compound assignment

| Op | Meaning |
|----|----------------|
| `+=` | add |
| `-=` | subtract |
| `*=` | multiply |
| `/=` | divide |
| `%=` | remainder |
| `&=` | bitwise AND |
| `\|=` | bitwise OR |
| `^=` | bitwise XOR |
| `<<=` | left shift |
| `>>=` | right shift |

---

## 4. Operators

Expressions are parsed with a Pratt parser. Precedence from tightest to loosest (matching `binop_precedence` in `src/parser.kr` and the table in `docs/GRAMMAR.md`):

| Precedence | Operators | Notes |
|------------|----------------------------------|--------------------------|
| (prefix) | `!`, `~`, `-` | Logical not, bitwise not, negation |
| 8 | `&`, `\|`, `^` | Bitwise AND / OR / XOR |
| 7 | `<<`, `>>` | Shift |
| 6 | `*`, `/`, `%` | Multiply, divide, remainder |
| 5 | `+`, `-` | Add, subtract |
| 4 | `<`, `<=`, `>`, `>=` | Comparison (signedness follows operand types) |
| 3 | `==`, `!=` | Equality |
| 2 | `&&` | Logical AND |
| 1 | `\|\|` | Logical OR |
| 0 | `?:` | Ternary (right-assoc, see §5) |

> **This differs from C.** Bitwise operators and shifts bind *tighter*
> than arithmetic and comparisons: `1 << 2 * 3` is `(1 << 2) * 3` = 12,
> `6 & 1 == 0` is `(6 & 1) == 0` (true), and `2 + 3 & 4` is `2 + (3 & 4)`
> = 2. Parenthesize when porting C code.

`<`, `<=`, `>`, `>=` are **type-directed**: when either operand has a signed type (`i8`..`i64`) the comparison is signed; otherwise it is unsigned. The `signed_lt` / `signed_gt` / `signed_le` / `signed_ge` built-ins force a signed comparison regardless of operand types (useful on raw `u64` bit patterns).

---

## 5. Control flow

### if / else

```kr
if x > 10 {
    println("big")
} else {
    println("small")
}
```

Parentheses around the condition are optional. `else if` works as a chain:

```kr
if n < 0 {
    println("negative")
} else if n == 0 {
    println("zero")
} else if n < 10 {
    println("small")
} else {
    println("big")
}
```

### ternary (`? :`)

`cond ? then_value : else_value` is an expression that picks one of two values.
It has the lowest precedence (below `||`) and is right-associative, so it nests cleanly in either arm:

```kr
let max = a > b ? a : b
let sign = n < 0 ? 0 - 1 : n > 0 ? 1 : 0
exit(ok ? 0 : 1)
```

Only the chosen arm is evaluated — the other is short-circuited, so calls in the unused arm don't run.

### while

```kr
u64 i = 0
while i < 10 {
    println(i)
    i = i + 1
}
```

### for (range)

```kr
for i in 0..n {
    println(i)
}
```

`0..n` is an **exclusive** range — `i` takes values `0, 1, ..., n-1`. The inclusive form `0..=n` visits `n` as well. The `in` keyword is optional: `for i 0..10` also parses.

### break and continue

```kr
while true {
    if done { break }
    if skip { continue }
    // ...
}
```

### loop

`loop { ... }` is an infinite loop — sugar for `while true`. Exit with `break` or `return`:

```kr
u64 n = 0
loop {
    n = n + 1
    if n == 3 { break }
}
```

### defer

`defer { ... }` schedules a block to run at every function exit — each `return` (including tuple returns) and the implicit fall-through — in LIFO order when there are several. `exit(n)` bypasses defers. Requires the IR backend (the default); `--legacy` rejects it.

```kr
fn demo() {
    defer { println_str("done") }   // runs at every exit point
    println_str("working")
    // prints "working" then "done"
}
```

### match

```kr
match opcode {
    1 => { println("one") }
    2 => { println("two") }
    3 => { println("three") }
    _ => { println("other") }   // default arm: matches anything
}
```

Arms are tested top-to-bottom. A pattern is an integer literal, a named integer constant, a comma-separated list (`1, 2, 3 => ...`), a range (`0..=31 => ...`, IR backend only), or `_` for a catch-all default arm. If no arm matches and there is no `_`, the match is a no-op.
An arm body can be a brace block or a single bare statement — the braces are optional for one statement:

```kr
match code {
    0 => return ok()
    1 => exit(1)
    _ => log("unknown")
}
```

**`match` as an expression.** When used in value position, each arm is a single expression and the whole `match` yields the matching arm's value (or `0` if nothing matches and there is no `_`):

```kr
let name = match day {
    0 => "Sun"
    6 => "Sat"
    _ => "weekday"
}
exit(match status {
    0 => 0
    _ => 1
})
```

### return

```kr
fn get_value() -> u64 {
    return 42
}

fn do_thing() {
    return   // void return — also fine to just fall off the end
}
```

---

## 6. Functions

```kr
fn name(TYPE param1, TYPE param2) -> RETURN_TYPE {
    // body
    return value
}
```

The return type after `->` is optional; omitting it means the function returns void. Parameters are `TYPE name` — type first.

```kr
fn add(u64 a, u64 b) -> u64 {
    return a + b
}

fn greet(u64 name) {
    print("Hello, ")
    print_str(name)
    println("!")
}
```

Recursion and mutual recursion work — function order within a file doesn't matter.

### Calling functions

```kr
u64 r = add(2, 3)
greet("world")
```

Up to 6 arguments are passed in registers on x86_64 (`rdi rsi rdx rcx r8 r9`, on every OS) and up to 8 on arm64 (`x0..x7`). Functions with more arguments pass the overflow on the stack.

### Type parameters (generics)

A function may declare type parameters with `<T>` or `<T, U>`:

```kr
fn max_t<T>(T a, T b) -> T {
    if a > b { return a }
    return b
}

fn main() {
    exit(max_t(3, 42))   // 42
}
```

Type parameters are **syntactic only** in the current implementation: every scalar is a 64-bit slot, so `T` is effectively `u64` at codegen time. There is no monomorphization and no type checking across instantiations — `max_t(3, 42)` and `max_t(struct_ptr_a, struct_ptr_b)` compile to the same machine code. Use the syntax when it makes the caller clearer; don't rely on it for type safety.
### Tuple return and destructure (2 or 3 elements)

A function can return two or three values and the caller can destructure them in one statement — `return (a, b)` / `return (a, b, c)` paired with `(u64 x, u64 y) = call()` / `(u64 x, u64 y, u64 z) = call()`. Tuples are limited to exactly two or three elements — four or more requires a struct (or an out-pointer parameter).

```kr
fn divmod(u64 x, u64 y) -> u64 {
    return (x / y, x % y)
}

fn main() {
    (u64 q, u64 r) = divmod(17, 5)
    println(q)   // 3
    println(r)   // 2
    exit(0)
}
```

Runtime convention:

- **x86_64** — first value in `rax`, second in `rdx`, third in `r8`. All three are caller-saved on the SysV ABI, so the extra values flow through the epilogue untouched.
- **arm64** — first in `x0`, second in `x1`, third in `x2`. Same AAPCS64 reasoning.

The function's declared return type stays scalar (`-> u64` above) — the tuple shape lives entirely in the `return (a, b)` expression and the `(T1 a, T2 b) = call(…)` destructure.

If you `return (a, b)` from a function but only call it as a scalar expression, you get the first value and the second is silently discarded. Calling a scalar-returning function as a destructure picks up whatever the callee happened to leave in `rdx` / `x1` (likely garbage). There is no arity check yet — match the two sides yourself.

Destructuring is only recognised at statement position, and both element types must be type keywords (`u8`..`u64`, `i8`..`i64`). You can't destructure a struct field or an element of a literal tuple; the RHS must be an expression whose tail evaluates into the two return registers — in practice, a call to a tuple-returning function.

---

## 7. Structs, methods, and enums

### Structs

```kr
struct Point {
    u64 x
    u64 y
}
```

Field layout is packed — no alignment padding. Fields are stored in declaration order at increasing offsets. Field sizes are determined by their type (`u8` = 1 byte, `u32` = 4 bytes, `u64` = 8 bytes, etc.).
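
To make the packed-layout rule concrete, here is a small sketch with mixed field widths. The `Header` struct is hypothetical, the offsets in the comments follow from the "no padding" rule above, and `sizeof` is the built-in described in §18; the snippet has not been compiled here:

```kr
// Hypothetical record used only to illustrate packed layout.
struct Header {
    u8 tag        // offset 0, 1 byte
    u32 length    // offset 1, 4 bytes (no padding after tag)
    u64 payload   // offset 5, 8 bytes
}

fn main() {
    println(sizeof(Header))   // 1 + 4 + 8 = 13
    exit(0)
}
```

A C compiler would typically pad the same declaration to 16 bytes, so keep the difference in mind when sharing layouts with C code or hardware descriptors.
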
```kr
Point p   // stack-allocated struct value
p.x = 10
p.y = 20
println(p.x)
```

### Heap-allocated structs

A struct variable can also be initialized with an expression that returns a pointer — typically `alloc(size)`. When written this way, the variable holds the pointer and field access dereferences it:

```kr
struct Node {
    u64 value
    u64 next
}

fn main() {
    Node a = alloc(16)   // a holds the heap pointer
    Node b = alloc(16)

    a.value = 10
    a.next = b           // field store on a pointer-backed struct
    b.value = 20
    b.next = 0

    Node cur = a
    while cur != 0 {
        println(cur.value)
        cur = cur.next   // reassign pointer variable
    }
    exit(0)
}
```

This is the idiomatic form for linked lists, BSTs, graph nodes, and any tree-shaped data. Field size is inferred from the struct declaration just like stack structs; the only difference is that the variable's slot holds a pointer to heap memory instead of stack memory. Reassigning the pointer variable is allowed, so traversal patterns like `cur = cur.next` work as expected.

### Methods

Attach a function to a struct with `fn StructName.method_name(StructName self, ...)`:

```kr
struct Point {
    u64 x
    u64 y
}

fn Point.sum(Point self) -> u64 {
    return self.x + self.y
}

fn main() {
    Point p
    p.x = 10
    p.y = 20
    u64 total = p.sum()   // 30
    println(total)
    exit(0)
}
```

The method receives `self` as a reference to the struct on the caller's stack — `self.field` reads and writes work normally.

### Enums

```kr
enum Color {
    Red = 0
    Green = 1
    Blue = 2
}
```

`Color.Red`, `Color.Green`, `Color.Blue` are named integer constants usable in any integer context (assignments, comparisons, match arms, switch bases, etc.). Enums are a compile-time convenience; no runtime object is created.

> **Reminder**: `<`, `<=`, `>`, `>=` follow the operand types (see §4):
> signed when an operand is `i8`..`i64`, unsigned otherwise.
> Values that can go negative — an AVL balance factor, a graph distance — should be
> declared with a signed type (`i64`), or compared with the
> `signed_lt`/`signed_le`/`signed_gt`/`signed_ge` builtins when they
> live in unsigned slots. This trips people up in tree and heap code
> surprisingly often.

---

## 8. Arrays

### Local arrays

```kr
u8[256] buffer     // byte buffer
u16[16] samples    // 16 × 2-byte values
u32[10] pixels     // 10 × 4-byte values
u64[10] numbers    // 10 × 8-byte values

buffer[0] = 0xAA
numbers[2] = 300
u64 first = numbers[0]
```

Local arrays are allocated on the stack. The element size follows the declared type — `u64[10]` reserves 80 bytes, `u32[10]` reserves 40, etc. Indexing is scaled automatically (`numbers[2]` loads 8 bytes from offset `2*8`). The variable holds a pointer to the first element, so `buffer` alone evaluates to the base address. Indexing is unchecked.

### Static arrays

At module level, a static array gets storage in the data section:

```kr
static u8[1024] message_buf     // 1024 bytes
static u16[16] sensor_samples   // 32 bytes
static u32[10] pixel_row        // 40 bytes
static u64[10] counters         // 80 bytes

fn main() {
    message_buf[0] = 72    // 'H'
    message_buf[1] = 105   // 'i'
    message_buf[2] = 0
    counters[0] = 1000000
    counters[9] = 2000000
    print_str(message_buf)
    exit(0)
}
```

All integer element widths (`u8`/`u16`/`u32`/`u64`, `i8`/`i16`/`i32`/`i64`) are supported and indexing is scaled automatically — `counters[5]` reads 8 bytes from offset `5*8`. (In compilers older than 2.6.3, wider element types silently miscompiled; upgrade if you see garbage reads.)

Static arrays are zero-initialized by the loader.

### Struct arrays

Fixed-size arrays of struct instances work both locally and statically:

```kr
struct Point { u64 x; u64 y }

fn main() {
    Point[10] pts
    pts[0].x = 1
    pts[0].y = 2
    pts[5].x = 50
    println(pts[5].x)
    exit(0)
}
```

Element indexing uses the struct's full size as stride.
`pts[i].field` is a first-class syntax that reads and writes the `field` at the correct offset within element `i`.

---

## 9. Slice parameters

A slice parameter `[TYPE] name` is sugar for a fat pointer: a `(ptr, len)` pair passed as two separate arguments. Inside the function, `data.len` reads the length, and `data` is a plain pointer for indexing.

```kr
fn sum_bytes([u8] data) -> u64 {
    u64 total = 0
    u64 i = 0
    u64 n = data.len
    while i < n {
        total = total + load8(data + i)
        i = i + 1
    }
    return total
}

fn main() {
    u8[6] buf
    buf[0] = 10
    buf[1] = 20
    buf[2] = 30
    // Caller passes (pointer, length) — two arguments
    u64 t = sum_bytes(buf, 3)
    println(t)
    exit(0)
}
```

The caller side explicitly passes the length as a normal second argument. This is the classic C `(ptr, len)` pattern with a nicer symbolic name for the length inside the callee.

---

## 10. Static variables and constants

### static

```kr
static u64 counter = 0
static u64 gpio_base = 0x3F200000

fn tick() {
    counter = counter + 1
}
```

Static variables live in the data section for the lifetime of the program. A literal initializer — `= 42`, `= 0x3F200000`, `= 'A'`, `= true`, with an optional leading `-` or `~` — is honoured and emitted into the binary. Without an initializer the slot is zero (BSS). Non-literal initializers (calls, named constants, arithmetic such as `= 5 + 3`) are **not** evaluated: only a leading literal, if any, is kept and the rest is silently dropped (`= 5 + 3` stores 5; tracked as issue #53). Set those values at startup instead.

### const

```kr
const u64 BAUD = 115200
const u64 UART_BASE = 0x3F201000
```

`const` creates a compile-time integer constant. At use sites the value is inlined — there is no runtime storage.

---

## 11. Pointer operations

KernRift has no dedicated pointer type. Addresses are just `u64` values.
To read or write memory at an address, use the pointer built-ins:

### The easy way

```kr
u64 v = load64(addr)         // read a 64-bit value
u32 x = load32(addr)         // read a 32-bit value
u16 h = load16(addr)         // read a 16-bit value
u8 b = load8(addr)           // read a single byte

store64(addr, 0xDEADBEEF)    // write 64 bits
store32(addr, 0x1234)        // write 32 bits
store16(addr, 0x5678)        // write 16 bits
store8(addr, 0xAA)           // write 1 byte
```

The load builtins zero-extend the read into a full `u64`. The store builtins write exactly the specified width.

### The verbose way (unsafe blocks)

You can also write the raw pointer syntax:

```kr
u64 val = 0
unsafe { *(addr as u32) -> val }       // load
unsafe { *(addr as u8) = some_byte }   // store
```

The cast type determines access width. Supported cast types: `u8`, `u16`, `u32`, `u64`, `i8`, `i16`, `i32`, `i64` (plus the long forms `uint8`..`int64`, and `f16`/`f32`/`f64` for float-typed access).

`unsafe { ... }` is just a marker block — it accepts exactly **one** pointer statement (one load or one store). Use a separate `unsafe` block for each additional access. The `load*` / `store*` builtins are equivalent and much easier to read — prefer them unless you have a reason to use `unsafe` blocks.

---

## 12. Volatile and atomic

### Volatile: MMIO-safe loads and stores

For memory-mapped I/O, the compiler must not reorder, elide, or cache the access, and the memory operation must complete before anything after it.

```kr
u32 v = vload32(mmio_addr)    // volatile load, barrier after
vstore32(mmio_addr, 0x01)     // volatile store, barrier before
```

All widths are available: `vload8`, `vload16`, `vload32`, `vload64`, `vstore8`..`vstore64`. The barrier emitted is:

- **x86_64**: `mfence` (full memory fence)
- **ARM64**: `DSB SY` (data synchronization barrier — waits for completion, not just ordering)

`volatile { *(addr as u32) = val }` is the equivalent block form and does the same thing.
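
As an illustration of the usual polling pattern built on these builtins, the sketch below spins on a status word until a bit is set. The `wait_for_bits` helper is hypothetical, and ordinary `alloc`'d memory stands in for a device register so the example can run hosted; on real hardware the address would be a fixed MMIO address. It has not been compiled here:

```kr
// Hypothetical helper: spin until any bit in `mask` is set at `status_addr`.
fn wait_for_bits(u64 status_addr, u64 mask) {
    while (vload32(status_addr) & mask) == 0 { }
}

fn main() {
    u64 status_addr = alloc(8)          // plain memory standing in for a register
    vstore32(status_addr, 0x01)         // the "device" raises its ready bit
    wait_for_bits(status_addr, 0x01)    // returns at once: the bit is already set
    println(vload32(status_addr))       // 1
    dealloc(status_addr)
    exit(0)
}
```

The explicit parentheses around the `&` follow the same style as the `UART0.Flag` loop in §13 and keep the condition readable for anyone used to C precedence.
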
### Atomic operations

Lock-free atomic primitives are available as builtins:

```kr
u64 v = atomic_load(addr)
atomic_store(addr, v)
u64 ok = atomic_cas(addr, expected, desired)   // compare-and-swap: 1 on success, 0 on failure
u64 old = atomic_add(addr, delta)              // returns old value
old = atomic_sub(addr, delta)
old = atomic_and(addr, mask)
old = atomic_or(addr, mask)
old = atomic_xor(addr, mask)
```

These compile to `LOCK`-prefixed instructions on x86_64 and `LDXR`/`STXR` exclusive pairs on ARM64. Unlike the read-modify-write builtins, which return the old value, `atomic_cas` returns `1` on success and `0` on failure.

---

## 13. Device blocks (MMIO)

For driver code, a `device` block describes a hardware register set at a fixed base address. Field reads and writes compile directly to volatile loads and stores of the right width — with the proper memory barriers.

```kr
device UART0 at 0x3F201000 {
    Data at 0x00 : u32
    Flag at 0x18 : u32
    IBRD at 0x24 : u32
    FBRD at 0x28 : u32
    LCRH at 0x2C : u32
    Ctrl at 0x30 : u32 rw
}

fn putc(u8 c) {
    // Spin until TX FIFO has room
    while (UART0.Flag & 0x20) != 0 { }
    UART0.Data = c
}
```

Syntax:

- `device NAME at ADDR { ... }` declares a device rooted at `ADDR`.
- `FIELD at OFFSET : TYPE [rw|ro|wo]` declares a register. The access specifier (`rw`, `ro`, `wo`) is currently optional and parsed-but-ignored — future versions will enforce it.
- Supported field types: `u8`, `u16`, `u32`, `u64` (and signed variants).

A read like `UART0.Data` emits a `vloadN` of the right width at `0x3F201000 + 0x00`. A write like `UART0.Ctrl = 1` emits a `vstoreN` with the appropriate barrier. Device blocks sit on top of the volatile builtins — there is no hidden mechanism, just a convenient named-register syntax.

---

## 14. Inline assembly

The `asm` keyword emits raw machine instructions at the call site.
### Single instruction

```kr
asm("nop")
asm("cli")
asm("sti")
```

### Multi-instruction block

```kr
asm {
    "cli";
    "mov rax, cr0";
    "sti"
}
```

### Raw hex bytes

When the assembler doesn't recognize a mnemonic, drop to hex:

```kr
asm("0x0F 0x01 0xD9")   // vmmcall (x86_64)
asm("0xD503201F")       // nop (ARM64)
```

### Supported instructions

**x86_64**: `nop`, `ret`, `hlt`, `int3`, `iretq`, `cli`, `sti`, `cpuid`, `rdmsr`, `wrmsr`, `lgdt [rax]`, `lidt [rax]`, `invlpg [rax]`, `ltr ax`, `swapgs`, control-register moves (`mov cr0, rax`, etc.), port I/O (`in al, dx`, `out dx, al`, wide variants).

**ARM64**: `nop`, `ret`, `eret`, `wfi`, `wfe`, `sev`, barriers (`isb`, `dsb sy/ish`, `dmb sy/ish`), `svc #N`, and `mrs` / `msr` for 20+ system registers including `SCTLR_EL1`, `VBAR_EL1`, `TCR_EL1`, `MAIR_EL1`, `MPIDR_EL1`, `CurrentEL`.

For anything not in the built-in table, use the raw hex form.

### I/O constraints

Any `asm(...)` or `asm { ... }` may be followed by `in(...)`, `out(...)`, and/or `clobbers(...)` clauses that describe how registers flow between the block and KernRift's local variables.

```kr
import "std/fmt.kr"

fn rdtsc_ns() -> u64 {
    u64 lo = 0
    u64 hi = 0
    asm { "rdtsc" } out(rax -> lo, rdx -> hi)
    return (hi << 32) | lo
}

fn main() {
    println_str(fmt_dec(rdtsc_ns()))
    exit(0)
}
```

A `cpuid` helper with both inputs and outputs:

```kr
fn cpuid_signature() -> u64 {
    u64 leaf = 0
    u64 zero = 0
    u64 a = 0
    u64 b = 0
    u64 c = 0
    u64 d = 0
    asm { "cpuid" } in(leaf -> rax, zero -> rcx) out(rax -> a, rbx -> b, rcx -> c, rdx -> d)
    return (b << 32) | c
}
```

**Clause semantics**:

- `in(<var> -> <reg>, ...)` — before the block runs, KernRift emits a `mov <reg>, <var>` for each pair. Inputs are load-only; the named variable is not updated after the block.
- `out(<reg> -> <var>, ...)` — after the block runs, KernRift emits a `mov <var>, <reg>` for each pair. Outputs are store-only.
- `clobbers(<reg>, ...)` — accepted syntactically but currently **advisory**.
  You still must list every register your block writes under `out(...)` if you need its value, and every register whose prior contents you don't care about should not be relied on after the block. The compiler does not yet save/restore clobbered callee-saved registers.

**Register names**:

- **x86_64**: `rax` `rcx` `rdx` `rbx` `rsp` `rbp` `rsi` `rdi` `r8` … `r15`. No 32-bit or 8-bit aliases yet — use the 64-bit form even if the instruction operates on a sub-register.
- **ARM64**: `x0` … `x30`. No `w` (32-bit) aliases.

**Limitations** (V1):

- Clauses must come immediately after the closing `)` or `}` of the asm form, before any other statement.
- Clobbers list is parsed but emits no save/restore code — list an output or pick non-conflicting registers.
- Only integer GPRs are accepted; no SSE/NEON register constraints.
- No memory-operand constraints (Rust's `in("rax") [ptr]` — not yet).
- Pinned-parameter inputs (rbx/r12 on x86_64, picked by the compiler for parameter slots 0 and 1) are handled correctly — KernRift emits a reg-reg move instead of a stack reload so pinning stays transparent.

---

## 15. Floating-point types

KernRift supports IEEE 754 floating-point types: `f32` (single, 32-bit), `f64` (double, 64-bit), and `f16` (half, 16-bit, storage-only — no arithmetic, use `f16_to_f32` / `f32_to_f16` for conversion).

### Literals

```kr
f64 x = 3.14    // f64 (default)
f64 y = 0.001
f32 w = 3.14f   // f32 (suffix)
```

### Arithmetic

```kr
f64 a = int_to_f64(6)
f64 b = int_to_f64(7)
f64 c = a * b   // 42.0
f64 d = a + b - c / a
```

Operators `+`, `-`, `*`, `/` work on matching float types. Mixing float and integer in one expression is a compile error — use the explicit conversion builtins.

### Comparisons

```kr
if a < b { ... }
if a == b { ... }
```

All comparison operators (`<`, `>`, `<=`, `>=`, `==`, `!=`) work. NaN follows IEEE 754: `NaN == NaN` is false. Test for NaN with `x != x` (true only for NaN).
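
The `x != x` test can be wrapped in a small helper. The sketch below is illustrative only: `is_nan` is a hypothetical name (not a documented builtin), the NaN is produced by dividing a zero `f64` by itself on the assumption that the division is carried out at run time with IEEE 754 semantics, and the snippet has not been compiled here:

```kr
// Hypothetical helper: returns 1 if x is NaN, 0 otherwise.
fn is_nan(f64 x) -> u64 {
    if x != x { return 1 }   // only NaN compares unequal to itself
    return 0
}

fn main() {
    f64 zero = 0.0
    f64 not_a_number = zero / zero   // 0.0 / 0.0 is NaN under IEEE 754
    f64 ordinary = 1.5
    println(is_nan(not_a_number))
    println(is_nan(ordinary))
    exit(0)
}
```

Returning `u64` rather than `bool` keeps the helper usable directly in `if` conditions and integer expressions, in the same way `str_eq` is.
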
### Conversions (explicit, no implicit coercion)

| Builtin | Description |
|---|---|
| `int_to_f64(u64) -> f64` | Integer to double |
| `int_to_f32(u64) -> f32` | Integer to single |
| `f64_to_int(f64) -> u64` | Double to integer (truncates toward zero) |
| `f32_to_int(f32) -> u64` | Single to integer |
| `f32_to_f64(f32) -> f64` | Widen single to double |
| `f64_to_f32(f64) -> f32` | Narrow double to single |
| `f32_to_f16(f32) -> f16` | Single to half (storage) |
| `f16_to_f32(f16) -> f32` | Half to single |

### Math library (`std/math_float.kr`)

```kr
import "std/math_float.kr"

f64 r = sqrt(int_to_f64(49))   // 7.0 (hardware)
f64 s = sin(f64_pi())          // ~0.0
f64 e = exp(int_to_f64(1))     // ~2.718
println_str(fmt_f64(e, 6))     // "2.718281"
```

| Function | Description |
|---|---|
| `sqrt(f64) -> f64` | Square root (hardware) |
| `abs_f(f64) -> f64` | Absolute value |
| `neg_f(f64) -> f64` | Negation |
| `sin(f64) -> f64` | Sine |
| `cos(f64) -> f64` | Cosine |
| `tan(f64) -> f64` | Tangent |
| `exp(f64) -> f64` | Exponential (e^x) |
| `log(f64) -> f64` | Natural logarithm |
| `pow(f64, f64) -> f64` | Power (x^y) |
| `floor(f64) -> f64` | Floor |
| `ceil(f64) -> f64` | Ceiling |
| `fmt_f64(f64, u64) -> u64` | Format as decimal string |
| `fmt_f32(f32, u64) -> u64` | Format f32 as decimal string |

### Function ABI

Float arguments use the float register file independently from integer arguments:

- **x86_64 SysV**: `xmm0`–`xmm7` for float args, return in `xmm0`
- **ARM64 AAPCS**: `d0`–`d7` for float args, return in `d0`

```kr
fn lerp(f64 a, f64 b, f64 t) -> f64 {
    return a + (b - a) * t
}
```

### Precision

| Type | Reliable decimal digits | Range |
|---|---|---|
| `f16` | ~3 | ±65504 |
| `f32` | ~7 | ±3.4 × 10³⁸ |
| `f64` | ~15 | ±1.8 × 10³⁰⁸ |

---

## 16. Allocators and memory management

```kr
import "std/alloc.kr"
```

KernRift ships three allocators in the standard library. All are backed by `mmap`/`VirtualAlloc` with no libc dependency.
### Low-level: `alloc` / `dealloc`

`alloc(size)` maps a new region and stores an 8-byte size header before the returned pointer. `dealloc(ptr)` reads that header and calls `munmap` (Linux/macOS) or `VirtualFree` (Windows) to release the pages. Previous releases left `dealloc` as a no-op; it now frees for real.

### Arena allocator

Bump-pointer allocator. Fast, no per-object free. Good for request-scoped or phase-scoped work where you free everything at once.

```kr
u64 a = arena_new(65536)         // 64 KiB slab
u64 p1 = arena_alloc(a, 128)     // bump 128 bytes
u64 p2 = arena_alloc(a, 256)     // bump 256 bytes
arena_reset(a)                   // rewind to start (no munmap)
(u64 total, u64 live) = arena_stats(a)
arena_destroy(a)                 // munmap; warns if bytes still live
```

### Pool allocator

Fixed-size slot allocator with an embedded free list. Constant-time alloc and free. Ideal for many same-sized objects (nodes, handles).

```kr
u64 pool = pool_new(64, 1024)    // 1024 slots of 64 bytes each
u64 obj = pool_alloc(pool)
pool_free(pool, obj)
(u64 capacity, u64 used) = pool_stats(pool)
pool_destroy(pool)               // warns if slots still in use
```

### Heap allocator

General-purpose variable-size allocator. First-fit with forward coalescing on free. Use when allocation sizes vary.
```kr
u64 h = heap_new(1048576)        // 1 MiB slab
u64 buf = heap_alloc(h, 4096)
heap_free(h, buf)
(u64 total, u64 freed, u64 live) = heap_stats(h)
heap_destroy(h)                  // warns if blocks still allocated
```

### API summary

| Function | Returns | Description |
|---|---|---|
| `arena_new(capacity)` | arena handle | Create arena with `capacity` bytes |
| `arena_alloc(arena, size)` | pointer | Bump-allocate `size` bytes (8-byte aligned) |
| `arena_reset(arena)` | — | Rewind used offset to 0 |
| `arena_destroy(arena)` | — | Release slab; leak warning if bytes live |
| `arena_stats(arena)` | `(total, live)` | Cumulative allocated bytes, currently live bytes |
| `pool_new(obj_size, count)` | pool handle | Create pool of `count` fixed-size slots |
| `pool_alloc(pool)` | pointer | Pop a slot from the free list |
| `pool_free(pool, ptr)` | — | Return slot; poisons + sets canary |
| `pool_destroy(pool)` | — | Release slab; leak warning if slots in use |
| `pool_stats(pool)` | `(capacity, used)` | Total slots, currently used slots |
| `heap_new(capacity)` | heap handle | Create heap with `capacity` bytes |
| `heap_alloc(heap, size)` | pointer | First-fit allocate (8-byte aligned) |
| `heap_free(heap, ptr)` | — | Free + forward coalesce; poisons + canary |
| `heap_destroy(heap)` | — | Release slab; leak warning if blocks allocated |
| `heap_stats(heap)` | `(total, freed, live)` | Bytes allocated, bytes freed, bytes live |

### Safety features

All three allocators share the same hardening:

- **Guard pages** — A `PROT_NONE` page is mapped at the end of every slab. Buffer overruns hit an immediate `SIGSEGV` instead of silently corrupting adjacent memory.
- **Double-free detection** — Pool and heap write a `0xDEADBEEFDEADBEEF` canary into freed slots/blocks. A second free of the same pointer prints a diagnostic and calls `exit(1)`.
- **Use-after-free poison** — Freed memory is filled with `0xEF` bytes (pool) or the canary pattern (heap).
  Reads of freed data return obviously wrong values instead of stale data.
- **Leak warnings** — `arena_destroy`, `pool_destroy`, and `heap_destroy` walk their metadata and print to stderr if any allocations were not freed (or not reset, for arenas).

---

## 17. Imports

Bring functions and declarations from another file into the current compilation unit:

```kr
import "std/io.kr"
import "std/string.kr"
import "utils.kr"
```

Import paths are resolved:

1. Relative to the importing file's directory
2. Then in the standard library location: `~/.local/share/kernrift/` (or `%LOCALAPPDATA%\KernRift\share\` on Windows)

Circular imports are detected and rejected. Each file is compiled at most once regardless of how many files import it.

---

## 18. Built-in functions

All of these are compiler intrinsics — no runtime library, no imports needed.

### I/O

| Function | Description |
|---|---|
| `print(a, b, ...)` | Typed, variadic (v2.8.3). Each arg is formatted according to its type: string literals emitted as-is, integers as decimal, floats via `fmt_f64`/`fmt_f32`, bools as `true`/`false`, chars as a single byte. Args are space-separated; no trailing newline. |
| `println(a, b, ...)` | Same, plus a newline. |
| `print_str(s)` | Print a null-terminated string from a pointer variable (for results of `int_to_str`, `fmt_hex`, etc.). |
| `println_str(s)` | Same, plus a newline. |
| `write(fd, buf, len)` | Write `len` bytes from `buf` to file descriptor `fd`. |
| `file_open(path, flags)` | Open a file. Returns a descriptor. |
| `file_read(fd, buf, len)` | Read up to `len` bytes. Returns bytes read. |
| `file_write(fd, buf, len)` | Write `len` bytes. Returns bytes written. |
| `file_close(fd)` | Close a descriptor. |
| `file_size(fd)` | Return the size of an open file. |

**f-strings** (v2.8.3): `f"x = {x}, pi ≈ {3.14}"` interpolates each `{expr}` with the same type-directed formatter `print` uses. `{{` and `}}` escape braces.
The surrounding string segments are emitted verbatim, so f-strings compose with variadic `println`: ```kr println(f"result = {answer} ({percent}%)") ``` **When to prefer `*_str`:** `print(variable)` formats the variable as a decimal integer (or float/bool/char, based on its static type). If your variable holds a string *pointer* — e.g. the return of `int_to_str` or a manually-built buffer — reach for `print_str` / `println_str`. ### Memory | Function | Description | |---|---| | `alloc(size)` | Heap-allocate `size` bytes. Returns a pointer. | | `dealloc(ptr)` | Free a previously allocated block. | | `memcpy(dst, src, len)` | Copy `len` bytes. | | `memset(dst, val, len)` | Fill `len` bytes with `val`. | | `str_len(s)` | Length of a null-terminated string. | | `str_eq(a, b)` | 1 if two null-terminated strings are equal, 0 otherwise. | | `sizeof(T)` | Compile-time byte size of a scalar type or struct. `sizeof(u64)` is 8; `sizeof(SomeStruct)` is its packed size (fields are laid out with no padding — see §7). Folds to a constant at compile time. | ### Pointer load/store | Function | Description | |---|---| | `load8/16/32/64(addr)` | Read a value of the given width, zero-extended to `u64`. | | `store8/16/32/64(addr, val)` | Write a value of the given width. | | `vload8/16/32/64(addr)` | Volatile load with barrier — for MMIO. | | `vstore8/16/32/64(addr, val)` | Volatile store with barrier — for MMIO. | ### Atomic | Function | Description | |---|---| | `atomic_load(ptr)` | Sequentially-consistent load. | | `atomic_store(ptr, val)` | Sequentially-consistent store. | | `atomic_cas(ptr, exp, new)` | Compare-and-swap. Returns 1 on success. | | `atomic_add/sub/and/or/xor(ptr, val)` | RMW, returns old value. | ### Bit manipulation | Function | Description | |---|---| | `bit_get(v, n)` | Bit `n` of `v` (0 or 1). | | `bit_set(v, n)` | Return `v` with bit `n` set. | | `bit_clear(v, n)` | Return `v` with bit `n` cleared. 
| | `bit_range(v, start, width)` | Extract `width` bits starting at `start`. | | `bit_insert(v, start, width, bits)` | Insert `bits` into `v` at position `start`. | ### Signed comparison The normal `<`, `<=`, `>`, `>=` operators are type-directed — signed when an operand is `i8`..`i64`, unsigned otherwise (see §4). To force a signed comparison on unsigned operands (raw `u64` bit patterns): ```kr signed_lt(a, b) signed_gt(a, b) signed_le(a, b) signed_ge(a, b) ``` ### Platform and process | Function | Description | |---|---| | `exit(code)` | Terminate the process with an exit code. | | `get_target_os()` | Host OS: `0`=Linux, `1`=macOS, `2`=Windows, `3`=Android. | | `get_arch_id()` | Compile-time arch ID: `1` Linux x86_64, `2` Linux arm64, `3` Win x86_64, `4` Win arm64, `5` macOS x86_64, `6` macOS arm64, `7` Android arm64, `8` Android x86_64. | | `exec_process(path)` | Spawn and wait for a process (argv = `{path, NULL}`). Returns exit code. | | `exec_process_argv(path, argv)` | Like `exec_process` but with an explicit NULL-terminated `argv` pointer array. | | `set_executable(path)` | `chmod +x` equivalent. | | `time_ns()` | Monotonic clock reading in nanoseconds (`CLOCK_MONOTONIC`). | | `get_module_path(buf, size)` | Write the current binary's path into `buf`. | | `fmt_uint(buf, val)` | Format `val` as decimal into `buf`. Returns length. | | `syscall_raw(nr, a1, a2, a3, a4, a5, a6)` | Raw syscall with up to 6 args. | ### Function pointers | Function | Description | |---|---| | `fn_addr(name)` | Get the address of a named function. The name is a string literal, resolved at link time. | | `call_ptr(addr, ...)` | Call a function by address with any number of arguments. The caller's signature must match the target's or the result is undefined. 
| Example — passing a comparator to a generic sort-ish loop: ```kr fn asc(u64 a, u64 b) -> u64 { return a < b } fn desc(u64 a, u64 b) -> u64 { return a > b } fn sorted(u64 a, u64 b, u64 cmp) -> u64 { if call_ptr(cmp, a, b) != 0 { return a } return b } fn main() { u64 c = fn_addr("asc") exit(sorted(3, 7, c)) // → 3 } ``` ### Cache and memory-ordering builtins (ARM64 / x86) | Function | ARM64 | x86_64 | Description | |---|---|---|---| | `isb()` | `ISB` | nop | Instruction-sync barrier. | | `dsb()` | `DSB SY` | `MFENCE` | Full data-sync barrier — waits for completion. | | `dmb()` | `DMB ISH` | `MFENCE` | Data-memory barrier (inner-shareable). | | `dcache_flush(addr)` | `DC CIVAC + DSB ISH + ISB` | `CLFLUSH + MFENCE` | Writeback + invalidate one cache line. | | `icache_invalidate(addr)` | `IC IVAU + DSB ISH + ISB` | nop (coherent) | Invalidate one I-cache line. | --- ## 19. Annotations Annotations appear immediately before a function or struct declaration. ### `@export` Marks a function for inclusion in the output binary's symbol table (for linking or ELF object introspection). ```kr @export fn my_entry() { } ``` ### `@section("name")` Places the function in a named section for kernel / bare-metal layouts. Under `--emit=asm` the listing emits a gas-style directive (`.section .text.init,"ax",@progbits`) before the function's label, so the output round-trips through GNU as + ld with a user-supplied linker script. ```kr @section(".text.init") fn _start() { /* placed in a separate section */ } ``` Under `--emit=obj` the name is captured but the ELF relocatable still groups all code into `.text` — full multi-section object emit is on the roadmap. ### `@naked` Emits a function with no prologue/epilogue. Useful for interrupt handlers and low-level entry points that manage their own stack. ```kr @naked fn isr() { asm { "cli"; "nop"; "iretq" } } ``` ### `@noreturn` Marks a function that never returns (e.g. `panic`, infinite loops). The compiler omits the epilogue. 
```kr @noreturn fn panic() { write(2, "panic\n", 6) while true { asm("hlt") } } ``` ### `@packed` Accepted on struct declarations. KernRift structs are *already* packed (no alignment padding), so this annotation is currently a no-op that documents intent. ```kr @packed struct Header { u8 kind u32 length u8 flags } ``` ### Analysis annotations: `@ctx`, `@eff`, `@caps`, `@acquires` / `@releases` A separate family of annotations feeds the optional effect / capability / lock analysis passes (`krc check`): `@ctx(...)` declares an execution context, `@eff(...)` the effects a function may perform, `@caps(...)` the capabilities it requires, and `@acquires(...)` / `@releases(...)` track lock ownership for the deadlock-cycle check. They are advisory pre-1.0 and do not affect codegen. See [EFFECT_SYSTEM.md](EFFECT_SYSTEM.md) for the full model. --- ## 20. Compiler CLI ```sh krc # compile to .krbo (fat binary, all 8 slices) krc -o out # specify output name krc --arch=x86_64 -o out # single-arch native ELF krc --arch=arm64 -o out # single-arch ARM64 ELF krc --targets=linux-x64,macos-arm64 -o out.krbo # custom fat subset (v2.8.x) # Single target (one platform, host or cross) instead of a fat binary: krc --target=linux -o out krc --target=macos -o out krc --target=windows -o out.exe krc --target=android -o out # Emit format (aliased since v2.8.4): # elfexe / elf → Linux ELF # pe → Windows PE # macho → macOS Mach-O # android → Android PIE ELF # obj (or -c) → ELF relocatable (.o / .obj) # lkm → Linux kernel module (.ko) — see docs/LKM.md # asm → annotated assembly listing (to -o path) # ir → SSA IR dump per function (to stdout) krc --emit=pe -o out.exe krc --emit=macho -o out krc --emit=android -o out krc -c -o out.o # shorthand for --emit=obj krc --emit=lkm -o mod.ko # Linux loadable kernel module (x86_64 only) # Codegen backend & optimization krc --arch=arm64 # default: IR (SSA + optimizer + regalloc) krc --legacy --arch=arm64 # legacy direct-walking codegen krc --ir # force IR 
even where a recipe falls back to legacy krc --no-coalesce # disable Briggs/George copy coalescing (default on) krc --no-check-types # disable the type checker (default on) krc --O0 # disable the IR optimizer (CF/DCE/CSE/LICM) krc --debug # runtime safety checks (bounds, null, some div-by-zero) krc -g -o out # emit DWARF debug info (.debug_line/info/abbrev/str) # Non-compile modes krc --freestanding -o out # no main trampoline, no auto-exit krc check # static checks only (semantic + type checker) krc fmt # auto-format: prints formatted source to stdout krc lc # living compiler report (section 21) krc lc --fix # apply auto-fixes in place krc lc --fix --dry-run # preview auto-fixes without writing krc lc --ci # CI gate: exit non-zero if patterns fire krc lc --min-fitness=N # filter: only patterns with fitness >= N krc lc --list-proposals # print the proposal registry krc lc --promote # promote a proposal to stable krc lc --deprecate # mark a proposal as deprecated krc lc --reject # revert a proposal to experimental krc --emit=ir # dump the SSA IR for a single function krc --version # print the compiler version krc --help # usage info ``` For debugging — `-g` DWARF in gdb, the `--debug` trap table, and the `--O0` → `--no-coalesce` → `--legacy` miscompile bisection ladder — see [docs/DEBUGGING.md](DEBUGGING.md). ### Static checks: the type checker A static type checker runs by default on every compile (and under `krc check`). Its errors are **fatal** — they abort the build with a `file:line:col` message, source line, and caret. Pass `--no-check-types` to disable it (e.g. to compile a file it rejects while you investigate). It is a focused checker centred on struct/float/void misuse, not a full Hindley–Milner system — integer-width mismatches, for instance, are not flagged. 
What it catches:

| Category | Example that is rejected |
|----------|--------------------------|
| Field access on a non-struct | `u64 n = 5` then `n.x` |
| Unknown field on a struct | `p.nope` where `nope` isn't a field of `p`'s type |
| Two different struct types mixed | `Q q = some_P` (init, assignment, argument, return, or match pattern) |
| Arithmetic on a struct value | `p + q` |
| Ordering comparison on a struct | `p < q` (`==`/`!=` are allowed — field-by-field equality) |
| Void function used as a value | `u64 x = print_str(s)` |
| Return value/void mismatch | `return x` in a void fn, or bare `return` in a value fn |
| Float-kind mismatch on return | returning an `f32` from an `-> f64` function |
| `match` on a float scrutinee | `match some_f64 { ... }` |
| Ternary / `match`-expr arms mixing float and int | `c ? 1.5 : 2` |

Because KernRift treats a struct-typed variable as a typed pointer, a struct and a raw pointer/`u64` mix freely (`P p = alloc(16)` is fine); only two *different* struct types clash.

### `kr` runner

```sh
kr program.krbo              # run a fat binary on any platform
kr program.krbo arg1 arg2    # forward args to the child
kr --version
kr --help
```

The `kr` runner auto-detects the host OS (Linux / Windows / macOS / Android) and architecture (x86_64 / arm64), extracts the matching slice from the `.krbo`, LZ4-decompresses it, reverses the BCJ filter, and executes the result. On Android (Linux ≥ 3.17) it uses `memfd_create` + `execveat(AT_EMPTY_PATH)` to bypass SELinux file-exec restrictions without writing to cwd; older kernels fall back to a `/data/local/tmp/kr-exec` / cwd temp file plus an `exit(120)` shell-wrapper trampoline.

---

## 21. Living compiler

`krc lc` analyses KernRift source and produces a two-layer report. The living compiler separates concerns into a **stable semantic core** (correctness and structural issues) and an **adaptive surface layer** (ergonomic migrations that lower to the same IR). This lets the language evolve without destroying compatibility.
### Basic report ```sh krc lc file.kr ``` Output has three sections: a telemetry summary, a fitness score (layer-weighted, 0–100), and the patterns detected in each layer. Patterns tagged `(auto-fix available)` can be rewritten mechanically. ### CI gating ```sh krc lc --min-fitness=60 file.kr # filter: only patterns with fitness >= 60 krc lc --ci file.kr # exit non-zero if any pattern fires krc lc --ci --min-fitness=50 file.kr # gate only on patterns >= 50 ``` ### Migration engine (auto-fix) ```sh krc lc --fix file.kr # rewrite in place krc lc --fix --dry-run file.kr # preview the rewritten source ``` The migration engine currently handles the `legacy_ptr_ops` pattern: - `unsafe { *(addr as T) -> dest }` → `dest = loadN(addr)` - `unsafe { *(addr as T) = val }` → `storeN(addr, val)` Both forms lower to identical code at the codegen level, so the rewrite is safe by construction. ### Proposal registry The living compiler ships with a registry of candidate syntax evolutions, each tagged with a lifecycle state (`experimental`, `stable`, or `deprecated`): ```sh krc lc --list-proposals ``` Proposals with triggers that match the current file fire inline in the report. Under `#lang stable` (the default), only stable proposals fire. Under `#lang experimental`, experimental proposals also fire as "coming-soon" hints. ### Governance: persistent per-project state Each project can override the compiler's baseline proposal states and store them in a `.kernrift/proposals` file at the project root: ```sh krc lc --promote # move a proposal to `stable` krc lc --deprecate # move a proposal to `deprecated` krc lc --reject # revert to `experimental` ``` The first invocation creates `.kernrift/proposals`. Subsequent runs of `krc lc` in that directory automatically load the overrides. 
The format is one line per proposal: ``` slice_for_buffer_params stable tail_call_intrinsic experimental extern_fn_decls deprecated ``` This is how the governance layer actually works — the compiler has a baseline, each project can pin its own decisions, and everything is version-controlled alongside the source. See [`docs/LIVING_COMPILER.md`](LIVING_COMPILER.md) for the full blueprint and the pipeline design. --- ## 22. Language profiles (`#lang`) A source file may pin its required language profile on the first line: ```kr #lang stable fn main() { // only features promoted to the stable surface are allowed println("hello") exit(0) } ``` ```kr #lang experimental fn main() { // experimental features are also allowed exit(0) } ``` Recognized profiles: | Profile | Meaning | |---|---| | `stable` | Default. All stable features. Safe for production code. | | `experimental` | Also allows features under active development. | The directive must be the first non-empty line of the file. If absent, the profile defaults to `stable`. Profiles are part of the Living Compiler's two-layer model: the stable semantic core doesn't change, but the adaptive surface layer may gate certain features (like `tail_call()` or `extern fn` when those are added) behind `#lang experimental`. This lets the language evolve without breaking existing files — pin a file to `stable` and it keeps compiling forever, even as new experimental features enter the language. --- ## 23. Freestanding mode `krc --freestanding` produces a binary suitable for bare-metal: - No automatic `exit(0)` at the end of `main`. - No OS-specific syscall wrappers injected. - The ELF entry point (`e_entry`) still points at `main` — you must provide `fn main()`. If you want a different name (e.g. `_start`), keep `fn main()` as the trampoline and have it call into your entry function. ```sh krc --freestanding --arch=arm64 kernel.kr -o kernel.elf ``` Use this for kernel entry points, bootloaders, and embedded firmware. 
The programmer is responsible for setting up the stack and handling any return from `main`. Mark functions that never return with `@noreturn` so the compiler skips the return-path check; annotate interrupt handlers with `@naked` to suppress the prologue/epilogue. Freestanding example: ```kr @noreturn fn main() { // kernel entry — set up your own state, never returns u64 vga = 0xB8000 store16(vga + 0, 0x0F48) // 'H' bright white store16(vga + 2, 0x0F69) // 'i' while true { } } ``` ### Stack size warnings The legacy backend (`--legacy`) prints a warning to stderr when a function's stack frame exceeds 49152 bytes (the default IR backend does not currently implement this warning): ``` warning: large stack frame (60032 bytes) in function 'parse_module' ``` This catches accidental large local arrays that could overflow a kernel stack. Big dispatch functions with many mutually exclusive branches legitimately allocate slots across branches; the threshold is set high enough to let those pass. --- ## 24. Extern functions `extern fn` declares a function that is resolved by the platform linker at link time. 
It has no body — the signature names an external symbol (typically from libc or another static library): ```kr extern fn strlen(u64 s) -> u64 extern fn write(u64 fd, u64 buf, u64 len) -> u64 fn main() { u64 msg = "hello from KernRift via libc!\n" write(1, msg, strlen(msg)) exit(0) } ``` Compile to a relocatable object and link with the platform toolchain: ```sh # Linux krc --emit=obj extern_libc.kr -o extern_libc.o gcc extern_libc.o -o extern_libc -no-pie # macOS krc --target=macos --emit=obj extern_libc.kr -o extern_libc.o clang extern_libc.o -o extern_libc # Windows krc --target=windows --emit=obj extern_libc.kr -o extern_libc.obj link extern_libc.obj msvcrt.lib /ENTRY:main /SUBSYSTEM:console ``` The compiler emits relocations in the native format of each target: | Target | Format | Relocation | |---------------|---------|---------------------------| | Linux x86_64 | ELF | `R_X86_64_PLT32` | | Linux ARM64 | ELF | `R_AARCH64_CALL26` | | macOS x86_64 | Mach-O | `X86_64_RELOC_BRANCH` | | macOS ARM64 | Mach-O | `ARM64_RELOC_BRANCH26` | | Windows x64 | COFF | `IMAGE_REL_AMD64_REL32` | | Windows ARM64 | COFF | `IMAGE_REL_ARM64_BRANCH26`| `extern fn` names shadow built-ins: if you declare `extern fn write(...)`, calls to `write` resolve to the libc symbol instead of the `write` syscall built-in. This lets you opt into the platform runtime on demand. Note that programs that call buffered libc functions (like `printf` or `puts`) from `main()` should exit via a libc `exit()` rather than the built-in `exit()` — the built-in uses a raw syscall that bypasses libc's stdio flush on exit. The safest pattern is to declare `extern fn exit` and use that: ```kr extern fn exit(u64 code) extern fn puts(u64 s) -> u64 fn main() { puts("flushed through stdio") exit(0) } ``` --- ## 25. 
Binary formats | Format | Produced by | Use | |---|---|---| | `.krbo` fat binary | default (no `--arch`) | Cross-platform distribution — `kr` picks the right slice | | ELF executable | `--arch=x86_64` / `--arch=arm64` on Linux | Native Linux binary | | ELF relocatable | `--emit=obj` | Link into an external object (`.o`) | | Mach-O | `--emit=macho` | macOS executable (x86_64 or arm64) | | PE | `--emit=pe` | Windows `.exe` | | Android PIE ELF | `--emit=android` | Android ARM64 (default) or x86_64 (`--arch=x86_64`) | | Assembly listing | `--emit=asm` | Human-readable disassembly with labels | A `.krbo` fat binary packs up to 8 platform slices (Linux x86_64, Linux ARM64, Windows x86_64, Windows ARM64, macOS x86_64, macOS ARM64, Android ARM64, Android x86_64), each BCJ+LZ4 compressed. The `kr` runner extracts and executes the slice matching the current host at startup. --- ## Appendix A. ABI reference This is a quick reference for anyone reading the code `krc` generates or linking it against other toolchains. It's the minimum you need to reason about register allocation, interoperate with C, or write `@naked` functions. ### x86_64 | Target | Arg regs (1..6/8) | Return | Callee-saved | Stack align at CALL | |---------|-------------------------------------|--------|---------------------------------|---------------------| | Linux | `rdi rsi rdx rcx r8 r9` (then stack) | `rax` | `rbx rbp r12 r13 r14 r15 rsp` | 16 | | macOS | same (System V) | `rax` | same | 16 | | Windows | `rcx rdx r8 r9` (then stack, +32 shadow) | `rax` | `rbx rbp rdi rsi rsp r12..r15 xmm6..xmm15` | 16 | - KernRift currently allocates only GPRs — no XMM usage in generated code, so the caller-saved XMM registers are irrelevant to user code but matter when you link against C. - On Windows, the first 32 bytes of the stack below `rsp` at call time are a **shadow** area owned by the callee. `krc` allocates it for you. 
- `@naked` functions get no prologue/epilogue — you're responsible for stack alignment if you call into user code. ### arm64 (AArch64) | Target | Arg regs | Return | Callee-saved | Syscall nr in | |--------------|-----------|--------|--------------------|---------------| | Linux | `x0..x7` | `x0` | `x19..x28 sp fp lr` | `x8` | | macOS | `x0..x7` | `x0` | same | `x16` | | Android | `x0..x7` | `x0` | same | `x8` | | Windows arm64| `x0..x7` | `x0` | same | (no syscalls; uses kernel32 IAT) | ## Appendix B. Syscall numbers `krc`'s builtins lower to real kernel syscalls. The table below is the number used by each builtin on each supported (OS × arch) target. Useful when reading `--emit=asm` output, stepping through with a debugger, or writing portable code that uses `syscall_raw`. ### Linux x86_64 | Builtin | nr | C name | |------------|-----|-----------------| | `write` | 1 | `write` | | `read` | 0 | `read` | | `exit` | 231 | `exit_group` | | `alloc` | 9 | `mmap` | | `dealloc` | 11 | `munmap` | | `file_open`| 2 | `open` | | `file_read`| 0 | `read` | | `file_write`|1 | `write` | | `file_close`|3 | `close` | | `time_ns` | 228 | `clock_gettime` | | `set_executable` | 90 | `chmod` | `syscall_raw(nr, a1, a2, a3, a4, a5, a6)` passes `nr` in `rax` and the arguments in `rdi rsi rdx r10 r8 r9` (standard Linux x86_64 ABI). The table above covers every `krc` builtin that lowers to a syscall — for anything else you're calling directly, get the number from the kernel's own table at [`arch/x86/entry/syscalls/syscall_64.tbl`](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl). Example: `getpid` is syscall 39 — `uint64 pid = syscall_raw(39, 0, 0, 0, 0, 0, 0)`. 
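The `getpid` number quoted above can be cross-checked from outside KernRift. The following is a minimal Python sketch, not part of `krc`: it goes through libc's `syscall()` wrapper via `ctypes` rather than KernRift's `syscall_raw`, and the names `SYS_GETPID_X86_64` and `raw_getpid` are hypothetical. It only issues the call on x86_64 Linux, because the same number means something else on other targets.

```python
import ctypes
import os
import platform
import sys

SYS_GETPID_X86_64 = 39  # number taken from the text above; x86_64 Linux only


def raw_getpid():
    """Issue syscall 39 through libc's syscall() wrapper.

    Returns None on anything other than x86_64 Linux, where this
    number would not refer to getpid.
    """
    if not sys.platform.startswith("linux") or platform.machine() != "x86_64":
        return None
    libc = ctypes.CDLL(None, use_errno=True)  # handle to the already-loaded libc
    libc.syscall.restype = ctypes.c_long
    return libc.syscall(ctypes.c_long(SYS_GETPID_X86_64))


if __name__ == "__main__":
    pid = raw_getpid()
    if pid is None:
        print("not x86_64 Linux; skipping the raw syscall")
    else:
        # The raw syscall and Python's own getpid should agree.
        print(pid, pid == os.getpid())
```

The same pattern works for checking any other number you plan to hand to `syscall_raw`, as long as the arguments are plain integers.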
### Linux arm64

| Builtin    | nr  |
|------------|-----|
| `write`    | 64  |
| `read`     | 63  |
| `exit`     | 93  |
| `alloc`    | 222 (`mmap`) |
| `dealloc`  | 215 (`munmap`) |
| `file_open`| 56 (`openat`) |
| `time_ns`  | 113 (`clock_gettime`) |
| `set_executable` | 53 (`fchmodat`) |

`syscall_raw` passes nr in `x8` and args in `x0..x5`. Complete numbering list: Linux kernel [`include/uapi/asm-generic/unistd.h`](https://github.com/torvalds/linux/blob/master/include/uapi/asm-generic/unistd.h) (arm64 uses the generic table).

### macOS x86_64

macOS syscall numbers use the high byte (bits 24 to 31) to encode the syscall class (2 = Unix class). The numbers below are the full 32-bit values passed in `rax`; arguments go in `rdi rsi rdx r10 r8 r9`, as on Linux (the `syscall` instruction clobbers `rcx`, so the fourth argument travels in `r10`).

| Builtin    | nr          | C name   |
|------------|-------------|----------|
| `exit`     | `0x2000001` | `exit`   |
| `write`    | `0x2000004` | `write`  |
| `read`     | `0x2000003` | `read`   |
| `alloc`    | `0x20000C5` | `mmap`   |

### macOS arm64

On arm64 macOS, the syscall number goes in **`x16`** (not `x8` as on Linux). Numbers are the plain Darwin numbers, not the class-tagged form.

| Builtin    | nr  |
|------------|-----|
| `exit`     | 1   |
| `write`    | 4   |
| `read`     | 3   |
| `alloc`    | 197 |

Darwin syscall table (both arches): xnu [`bsd/kern/syscalls.master`](https://github.com/apple-oss-distributions/xnu/blob/main/bsd/kern/syscalls.master). On x86_64 macOS, OR the base number with `0x2000000` to form the `rax` value (e.g. `exit` = `1 | 0x2000000 = 0x2000001`).
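The two macOS tables above are related by that single OR, which a short Python sketch can confirm. The helper names (`darwin_x86_64_nr`, `split_nr`, `DARWIN_BASE`) are hypothetical and not part of `krc`. The 24-bit class shift is an inference from the `0x2000000` constant; only the numeric values come from the tables.

```python
# Hypothetical helper names; only the numeric values come from the tables above.
CLASS_SHIFT = 24   # assumption: 0x2000000 == 2 << 24
UNIX_CLASS = 2     # "2 = Unix class"

# Plain Darwin numbers, as listed for macOS arm64.
DARWIN_BASE = {"exit": 1, "write": 4, "read": 3, "alloc": 197}


def darwin_x86_64_nr(base):
    """Return the class-tagged value passed in rax on x86_64 macOS."""
    return (UNIX_CLASS << CLASS_SHIFT) | base


def split_nr(tagged):
    """Split a class-tagged value back into (class, base number)."""
    return tagged >> CLASS_SHIFT, tagged & ((1 << CLASS_SHIFT) - 1)


if __name__ == "__main__":
    for name, base in DARWIN_BASE.items():
        print(f"{name:6} {base:3} -> {darwin_x86_64_nr(base):#x}")
```

Going the other way, `split_nr` recovers the plain number from a value seen in `rax`, which helps when reading `--emit=asm` output for an x86_64 macOS slice.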
### Windows Windows x86_64 and arm64 do not use direct syscalls — every I/O and process-control builtin lowers to a call through the binary's Import Address Table (IAT) against `kernel32.dll`: | Builtin | kernel32 import | |--------------------|----------------------------| | `exit` | `ExitProcess` | | `write` | `GetStdHandle` + `WriteFile` | | `read` | `GetStdHandle` + `ReadFile` | | `alloc` | `VirtualAlloc` | | `dealloc` | `VirtualFree` | | `file_open` | `CreateFileA` | | `file_read` | `ReadFile` | | `file_write` | `WriteFile` | | `file_close` | `CloseHandle` | | `exec_process` | `CreateProcessA` + `WaitForSingleObject` + `GetExitCodeProcess` + `ExitProcess` | | `set_executable` | no-op (Windows has no executable bit) | `syscall_raw` is **not supported** on Windows — the platform has no stable syscall numbering. The `--target=windows` PE output uses IAT imports exclusively. ## Appendix C. `--emit=obj` section layout A relocatable object file (`.o` on Linux/macOS, `.obj` on Windows) produced by `--emit=obj` contains the minimum set of sections the platform linker needs. No `.rodata`, no `.bss`, no `.data` — string literals and static scalars are placed at the end of `.text` and referenced with RIP-relative addressing. 
### Linux x86_64 / arm64 (ELF) | Index | Name | Type | Purpose | |-------|-------------------|-----------|---------| | 0 | (null) | NULL | required by ELF | | 1 | `.text` | PROGBITS | code + string literals + static scalars | | 2 | `.data` | PROGBITS | (emitted empty — static data lives inside `.text`) | | 3 | `.symtab` | SYMTAB | every `fn` is a symbol; `main` is `GLOBAL`, others `LOCAL` | | 4 | `.strtab` | STRTAB | symbol name strings | | 5 | `.shstrtab` | STRTAB | section header names | | 6 | `.note.GNU-stack` | PROGBITS (flags=0) | marks the binary as non-exec-stack so `ld` doesn't warn | | 7 | `.rela.text` | RELA | only present if the program uses `extern fn` | Relocation types for `extern fn` call sites: - **x86_64**: `R_X86_64_PLT32` (disp32 = -4 addend) - **arm64**: `R_AARCH64_CALL26` (addend 0) ### macOS x86_64 / arm64 (Mach-O) One `__TEXT,__text` section containing code + string literals. Symbol names are prefixed with an underscore (`_main`, `_write`) as required by the Darwin C ABI. `extern fn` call sites use relocations `X86_64_RELOC_BRANCH` (x86_64) and `ARM64_RELOC_BRANCH26` (arm64). ### Windows x86_64 / arm64 (COFF `.obj`) One `.text` section, one COFF symbol table. No underscore prefix on x86_64. `extern fn` call sites use relocations `IMAGE_REL_AMD64_REL32` (x86_64) and `IMAGE_REL_ARM64_BRANCH26` (arm64). ### Linking with gcc or clang ```sh # Linux krc --emit=obj prog.kr -o prog.o gcc prog.o -o prog -no-pie # No more "missing .note.GNU-stack" warning as of v2.6.3 — the compiler # emits the section by default so linked binaries get a non-executable # stack. ``` ## Appendix D. `.krbo` fat-binary format (v2) The runtime format for `.krbo` files — directly parseable without any KernRift toolchain. **Layout**: ``` offset size field 0x00 8 magic: "KRBOFAT\0" 0x08 4 version: u32 = 2 0x0C 4 arch_count: u32 (currently emitted as 8) 0x10 (arch_count × 48) arch descriptor table ... 
compressed slice blobs (per arch) ``` > **Note**: the descriptor reserves `runtime_offset` / `runtime_len` > for per-arch kr-runner blobs, but the current emitter writes them > as `0` and the runner ignores them. Decoders should treat those > fields as informational only. **Arch descriptor** (48 bytes each, one per slice): ``` offset size field +0x00 4 arch_id: u32 (see table below) +0x04 4 compression: u32 (1 = LZ4 frame, preceded by BCJ filter) +0x08 8 slice_offset: u64 (from start of file) +0x10 8 slice_comp_size: u64 +0x18 8 slice_uncomp_size: u64 +0x20 8 runtime_offset: u64 (reserved, emitted as 0) +0x28 8 runtime_len: u64 (reserved, emitted as 0) ``` **Arch IDs**: | id | OS | arch | |----|----------|--------| | 1 | Linux | x86_64 | | 2 | Linux | arm64 | | 3 | Windows | x86_64 | | 4 | Windows | arm64 | | 5 | macOS | x86_64 | | 6 | macOS | arm64 | | 7 | Android | arm64 | | 8 | Android | x86_64 | **Decompression**: each slice is LZ4-compressed with a BCJ filter applied *before* compression. On extraction the runner first LZ4- decompresses, then runs the matching BCJ filter in reverse to restore the original call/jmp offsets. BCJ filter selection: - x86-family arch_ids (1, 3, 5, 8): x86_64 BCJ filter (rewrites `E8`/`E9` disp32 offsets to absolute, for better compression). - arm-family arch_ids (2, 4, 6, 7): AArch64 BCJ filter (rewrites `BL` imm26 fields). Edge case: the x86_64 BCJ filter is a no-op when the slice is shorter than 5 bytes (the minimum length of an `E8`/`E9` disp32 instruction), and the arm64 filter is a no-op on slices shorter than 4 bytes. Both conditions happen only for pathologically tiny test programs and are safe — there is nothing to rewrite in either direction, so encode+decode remains a perfect round-trip. **Minimal Python decoder**: ```python import struct, lz4.frame def parse_krbo(path): d = open(path, 'rb').read() assert d[:8] == b'KRBOFAT\0' ver, n = struct.unpack_from('