---
name: security
description: "Security hardening for OCaml libraries through systematic vulnerability research. Use when Claude needs to: (1) Research CVEs in similar implementations (C, Rust, Go, Python) and add regression tests, (2) Add fuzz tests for parsers and encoders, (3) Audit integer handling and buffer operations, (4) Test boundary conditions and malformed input, (5) Review cryptographic usage, (6) Add defensive checks against common vulnerability classes"
---

# OCaml Security Audit

Systematic security hardening through vulnerability research, defensive coding, and comprehensive testing.

## Core Philosophy

1. **Study the attacks first**: Research CVEs in equivalent C/Rust/Go/Python libraries before writing tests
2. **Assume hostile input**: Every parser, decoder, and protocol handler receives adversarial data
3. **Fail explicitly**: Reject malformed input early with clear errors, never silently corrupt
4. **Test the boundaries**: Edge cases at min/max values, empty input, and overflow points
5. **Defense in depth**: Multiple validation layers, even when one seems sufficient

## Workflow

### Phase 1: CVE Research

Before writing any tests, research known vulnerabilities in equivalent implementations.

**1. Identify comparable libraries:**

```
OCaml library      → Research in
─────────────────────────────────────
PNG/image parser   → libpng, image-rs, Pillow
TLS/crypto         → OpenSSL, BoringSSL, rustls
HTTP parser        → http-parser, hyper, httptools
JSON parser        → json-c, serde_json, ujson
XML parser         → libxml2, quick-xml, defusedxml
Archive (zip/tar)  → libarchive, zip-rs, tarfile
DNS resolver       → c-ares, trust-dns, dnspython
Compression        → zlib, miniz, flate2
YAML parser        → libyaml, serde_yaml, PyYAML
PDF parser         → poppler, pdf-rs, PyPDF2
ASN.1/X.509        → OpenSSL, ring, pyasn1
```

**2. Search for CVEs:**

```bash
# Search NVD database
curl "https://services.nvd.nist.gov/rest/json/cves/2.0?keywordSearch=libpng+buffer+overflow" | jq '.vulnerabilities[].cve.descriptions[0].value'

# Search CVE Details
# https://www.cvedetails.com/vulnerability-search.php

# Search GitHub Security Advisories
# (securityVulnerabilities filters by ecosystem and package name)
gh api graphql -f query='{
  securityVulnerabilities(first: 10, ecosystem: PIP, package: "pillow") {
    nodes { advisory { summary description severity } }
  }
}'

# Check OSV database
curl "https://api.osv.dev/v1/query" -d '{"package": {"name": "pillow", "ecosystem": "PyPI"}}'
```

**3. Document findings** - track CVEs and map to tests. See `references/vulnerability-classes.md`.

### Phase 2: Vulnerability Classes

For each vulnerability class, add targeted tests.

#### Integer Handling

```ocaml
(** Test integer overflow in length calculations.
    CVE pattern: libpng, zlib, many image libraries *)
let test_length_overflow () =
  let huge_width = Int.max_int in
  let huge_height = 1 in
  match Image.create ~width:huge_width ~height:huge_height with
  | Error `Overflow -> ()
  | Error _ -> Alcotest.fail "wrong error type"
  | Ok _ -> Alcotest.fail "should reject overflow"

(** Test signed/unsigned confusion.
    Signed length interpreted as huge unsigned value. *)
let test_negative_length () =
  let data = Bytes.create 4 in
  Bytes.set_int32_be data 0 (-1l); (* 0xFFFFFFFF *)
  match Parser.read_with_length data with
  | Error (`Invalid_length _) -> ()
  | _ -> Alcotest.fail "should reject negative length"
```

#### Buffer Boundaries

```ocaml
(** Test out-of-bounds read.
    Claimed length exceeds actual data. *)
let test_oob_read () =
  let header = "\x00\x00\x00\x10" in (* Claims 16 bytes *)
  let data = header ^ "short" in (* Only 5 bytes of payload *)
  match Parser.decode data with
  | Error (`Truncated _) -> ()
  | Error _ -> Alcotest.fail "wrong error"
  | Ok _ -> Alcotest.fail "should reject truncated data"

(** Test empty input handling. *)
let test_empty_input () =
  match Parser.decode "" with
  | Error _ -> () (* Any error is acceptable *)
  | Ok _ -> Alcotest.fail "should reject empty input"
```

#### Denial of Service

```ocaml
(** Test resource exhaustion - deeply nested structures.
    CVE pattern: ujson, many JSON parsers *)
let test_deep_nesting () =
  let depth = 10000 in
  let nested =
    String.concat "" (List.init depth (fun _ -> "["))
    ^ String.concat "" (List.init depth (fun _ -> "]"))
  in
  match Json.parse nested with
  | Error (`Nesting_too_deep _) -> ()
  | Error _ -> () (* Resource errors are acceptable *)
  | Ok _ -> Alcotest.fail "should limit nesting depth"

(** Test exponential expansion (billion laughs).
    CVE pattern: XML entity expansion *)
let test_entity_expansion () =
  (* Abbreviated payload: a real attack nests entity definitions
     so that each level multiplies the expanded size. *)
  let malicious = "<!DOCTYPE x [<!ENTITY a \"...\">]>&a;&a;&a;..." in
  match Xml.parse malicious with
  | Error _ -> ()
  | Ok _ -> Alcotest.fail "should reject entity expansion"
```

### Phase 3: Fuzz Testing

Add fuzz tests for all parsers and encoders. See the **fuzz** skill for patterns.

**Priority targets:**

1. Binary protocol parsers (highest risk)
2. Text format parsers (JSON, XML, config files)
3. Cryptographic operations
4. Compression/decompression
5. Character encoding conversions

```ocaml
(** Fuzz test: parser must not crash on any input. *)
let test_decode_crash_safety buf =
  let buf = truncate buf in
  let _ = Parser.decode (to_bytes buf) in
  ()

(** Fuzz test: encoder output must be parseable. *)
let test_roundtrip buf =
  let buf = truncate buf in
  match Parser.decode (to_bytes buf) with
  | Error _ -> ()
  | Ok v ->
    let encoded = Parser.encode v in
    match Parser.decode encoded with
    | Error _ -> fail "encoded data failed to parse"
    | Ok v' -> if v <> v' then fail "roundtrip mismatch"
```

### Phase 4: CVE Regression Tests

For each applicable CVE, write a targeted regression test.

```ocaml
(** CVE-2023-XXXX regression test.
    Reference: https://nvd.nist.gov/vuln/detail/CVE-2023-XXXX

    Integer overflow when calculating buffer size from
    untrusted width/height values. *)
let test_cve_2023_xxxx () =
  let malicious_input = Bytes.of_string "\xff\xff\xff\xff\x00\x00\x00\x01" in
  match Image.decode malicious_input with
  | Error _ -> ()
  | Ok _ -> Alcotest.fail "CVE-2023-XXXX: should reject overflow"
```

## Defensive Coding Patterns

### Validate Early, Fail Fast

```ocaml
let decode buf =
  (* Check minimum size before any parsing *)
  if Bytes.length buf < header_size then
    Error (`Truncated { expected = header_size; got = Bytes.length buf })
  else
    let length = Bytes.get_int32_be buf 0 |> Int32.to_int in
    (* Validate length before allocating *)
    if length < 0 then
      Error (`Invalid_length length)
    else if length > max_message_size then
      Error (`Message_too_large { size = length; max = max_message_size })
    else if Bytes.length buf < header_size + length then
      Error (`Truncated { expected = header_size + length; got = Bytes.length buf })
    else
      parse_body buf length
```

### Safe Integer Arithmetic (Hacker's Delight style)

Overflow detection without branches where possible, using bit manipulation.

```ocaml
(** Detect signed addition overflow (Hacker's Delight, 2-13).
    Overflow iff both operands have same sign and result has different sign. *)
let add_overflow a b =
  let sum = a + b in
  (* (a ^ sum) & (b ^ sum) is negative iff overflow *)
  (a lxor sum) land (b lxor sum) < 0

(** Detect signed multiplication overflow (Hacker's Delight, 2-13).
    For non-negative operands: a * b overflows iff a > max_int / b *)
let mul_overflow a b =
  if b = 0 then false
  else if b = -1 then a = Int.min_int (* Special case: min_int * -1 *)
  else if b > 0 then a > Int.max_int / b || a < Int.min_int / b
  else (* b < -1 *) a > Int.min_int / b || a < Int.max_int / b

(** Safe addition with overflow check. *)
let safe_add a b =
  if add_overflow a b then Error `Overflow else Ok (a + b)

(** Safe multiplication with overflow check. *)
let safe_mul a b =
  if mul_overflow a b then Error `Overflow else Ok (a * b)

(** Unsigned comparison for signed integers (Hacker's Delight, 2-12).
    Interprets both values as unsigned. *)
let unsigned_lt a b =
  (a lxor Int.min_int) < (b lxor Int.min_int)

(** Check if value fits in n bits unsigned.
    [1 lsl n] wraps once n reaches [Sys.int_size - 1]; every
    non-negative int fits at that width, so skip the shift. *)
let fits_in_bits n v =
  v >= 0 && (n >= Sys.int_size - 1 || v < 1 lsl n)

(** Use in size calculations. *)
let calculate_buffer_size ~width ~height ~bytes_per_pixel =
  let open Result.Syntax in
  let* row_size = safe_mul width bytes_per_pixel in
  let* total = safe_mul row_size height in
  if total > max_buffer_size then Error `Buffer_too_large
  else Ok total
```

### Safe Integer Narrowing

```ocaml
(** Safe conversion from int to int32. *)
let int_to_int32 n =
  if n < Int32.(to_int min_int) || n > Int32.(to_int max_int) then
    Error `Overflow
  else
    Ok (Int32.of_int n)

(** Safe conversion from int64 to int. *)
let int64_to_int n =
  if n < Int64.of_int Int.min_int || n > Int64.of_int Int.max_int then
    Error `Overflow
  else
    Ok (Int64.to_int n)

(** Convert length field to int, rejecting values that don't fit. *)
let length_field_to_int len_field =
  let open Result.Syntax in
  let* n = int64_to_int len_field in
  if n < 0 then Error (`Invalid_length n)
  else Ok n
```

### Constant-Time Comparison

```ocaml
(** Constant-time string comparison for secrets.
    Prevents timing side-channels when comparing MACs, tokens, etc. *)
let constant_time_equal a b =
  let len_a = String.length a in
  let len_b = String.length b in
  let result = ref (len_a lxor len_b) in
  for i = 0 to min len_a len_b - 1 do
    result := !result lor (Char.code a.[i] lxor Char.code b.[i])
  done;
  !result = 0
```

### Resource Limits

```ocaml
type limits = {
  max_depth : int;
  max_string_length : int;
  max_array_length : int;
  max_total_size : int;
}

let default_limits = {
  max_depth = 100;
  max_string_length = 10_000_000;
  max_array_length = 100_000;
  max_total_size = 100_000_000;
}

let parse ?(limits = default_limits) input =
  parse_with_limits ~limits input
```

## Security Checklist

Before releasing any parser, encoder, or protocol handler:

### Input Validation

- [ ] Empty input rejected or handled correctly
- [ ] Minimum size checked before parsing
- [ ] Maximum size limits enforced
- [ ] Length fields validated before use
- [ ] Negative lengths rejected

### Integer Safety

- [ ] Multiplication overflow checked in size calculations
- [ ] Addition overflow checked in offset calculations
- [ ] No signed/unsigned confusion in lengths
- [ ] Cast results checked when narrowing

### Resource Limits

- [ ] Nesting depth limited
- [ ] Collection sizes limited
- [ ] Total memory usage bounded
- [ ] Recursion depth bounded

### Fuzz Testing

- [ ] Crash-safety test with arbitrary bytes
- [ ] Roundtrip test for encode/decode pairs
- [ ] AFL campaign run for 24+ hours
- [ ] No crashes or hangs found

### CVE Coverage

- [ ] CVEs in similar libraries researched
- [ ] Regression tests written for applicable CVEs
- [ ] Tests documented with CVE references

## References

- `references/vulnerability-classes.md` - Detailed patterns for each vulnerability class
- **fuzz** skill - Comprehensive fuzz testing patterns
- **ocaml-testing** skill - Unit test organization
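
The Resource Limits pattern and the "Nesting depth limited" checklist item leave the actual enforcement unspecified. As a rough sketch only, here is one way a depth limit might be checked, assuming a toy input language made of square brackets. The function name `check_nesting` and the `` `Unbalanced `` error tag are hypothetical and not part of any library; the `` `Nesting_too_deep `` tag mirrors the one used in the denial-of-service test.

```ocaml
(* Hypothetical sketch: reject input whose bracket nesting exceeds
   [max_depth]. Names here are illustrative, not a real API. *)
let check_nesting ~max_depth (input : string) =
  let len = String.length input in
  (* Tail-recursive scan: stack use stays constant regardless of how
     deeply the hostile input is nested. *)
  let rec go i depth =
    if i >= len then
      if depth = 0 then Ok () else Error `Unbalanced
    else
      match input.[i] with
      | '[' ->
        (* Check the limit before descending, so the failure is
           reported at the first bracket that exceeds it. *)
        if depth >= max_depth then Error (`Nesting_too_deep (depth + 1))
        else go (i + 1) (depth + 1)
      | ']' ->
        if depth = 0 then Error `Unbalanced else go (i + 1) (depth - 1)
      | _ -> go (i + 1) depth
  in
  go 0 0
```

A real parser would track depth alongside its other state rather than in a separate pass, but the design choice carries over: compare against the limit before recursing or allocating, and return an explicit error rather than relying on a stack overflow.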