# `.cv` File Format — Specification, version 1.0

> **Status:** Stable. Within this `MAJOR` (`1.x`) consumers MUST ignore unknown fields and continue rendering. Breaking changes require a new `MAJOR`.

The key words **MUST**, **MUST NOT**, **SHOULD**, **SHOULD NOT**, **MAY**, **REQUIRED**, **RECOMMENDED**, and **OPTIONAL** in this document are to be interpreted as described in [BCP 14](https://www.rfc-editor.org/info/bcp14) ([RFC 2119](https://www.rfc-editor.org/rfc/rfc2119) and [RFC 8174](https://www.rfc-editor.org/rfc/rfc8174)) when, and only when, they appear in all capitals.

## 1. Introduction

A `.cv` file bundles three coordinated representations of a single document, plus optional pre-computed embeddings:

- A designed **PDF** — what humans see.
- A clean **Markdown** copy — what bots, ATS systems, and LLMs read.
- A self-contained **HTML** rendering — what web pages embed.
- Optional **embeddings** — pre-computed vectors over the markdown so retrieval pipelines can skip an embedding pass.

A `.cv` file **IS** a valid PDF. Existing PDF readers open it without modification. The other representations are carried inside the PDF as PDF/A-3 Associated Files (`/AF`).

### 1.1 Goals

- One source of truth for a document, with three audience-specific representations always in sync.
- Zero-install human fallback: the file opens in any PDF reader on day one.
- Bot-ready: the markdown is trivial to extract for indexing, ATS parsing, and LLM context.
- Web-embeddable: the HTML is self-contained and ready to drop into a `` web component.
- AI-ready: optional pre-computed embeddings let RAG pipelines index the file without re-embedding.
- Forward-compatible: unknown fields are ignored within the same major version.

### 1.2 Non-goals (in 1.0)

- Encryption (revisit in 1.1).
- Digital signatures (revisit in 1.x via PAdES).
- Multi-document containers (one document per `.cv`).
### 1.3 Conformance levels

- **`cv-strict`**: conforms to PDF/A-3u (ISO 19005-3, Unicode level) and satisfies every **MUST** in this spec. Conformance is verified with an ISO 19005-3 validator; [veraPDF](https://verapdf.org/) is the reference validator used by this project's CI.
- **`cv-lenient`**: is a valid PDF, carries the `cv:version` XMP marker, and contains at least one valid content payload. Useful for environments where producing PDF/A-3u is impractical.

Implementations **MUST** state which level they produce and **SHOULD** support reading both.

## 2. Terminology

- **Producer** — software that writes a `.cv` file.
- **Consumer** — software that reads a `.cv` file.
- **Container** — the wrapping PDF.
- **Payload** — a file embedded in the container via `/AF`.
- **Primary payload** — the payload that consumers should treat as the canonical text representation when no other preference applies. Identified by `cv:primaryPayload`.
- **Alternate payload** — a payload carrying the same content as the primary in another representation or language.
- **Supplement payload** — a payload that adds material not present in the primary (cover letter, portfolio link list, etc.).

## 3. Container

### 3.1 PDF version

A `.cv` file **MUST** be a valid PDF 1.7 or PDF 2.0 file.

### 3.2 PDF/A-3u conformance (`cv-strict`)

A `cv-strict` file **MUST** conform to PDF/A-3u (ISO 19005-3, Unicode level). PDF/A-3 permits arbitrary embedded files, and its Unicode level (`u`) guarantees Unicode text extraction.

Practical implication: every font used in the visual PDF **MUST** be fully embedded (ISO 19005-3 § 6.2.11.4.1). Producers that reference the standard 14 PDF base fonts by name (Helvetica, Times, Courier, Symbol, ZapfDingbats) without embedding them will fail this requirement. Embedding is performed by the input-PDF generator, not by `.cv` packers — most modern producers (LaTeX/XeLaTeX, browser print-to-PDF, Word/LibreOffice export, headless Chromium) embed fonts by default.
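
The font rule in §3.2 can be pre-checked before running a full validator. The following is a minimal sketch, assuming the font dictionaries have already been parsed into plain Python dicts with names stored without the leading slash. That in-memory model and the helper names `is_embedded` and `unembedded_fonts` are hypothetical and belong to no particular PDF library. The sketch only looks for the presence of a font program (`FontFile`, `FontFile2`, or `FontFile3` in the font descriptor); it does not verify glyph coverage and is not a substitute for an ISO 19005-3 validator such as veraPDF.

```python
from typing import Any, Dict, Iterable, List

# Font-program keys that a font descriptor carries when the font is embedded.
FONT_FILE_KEYS = ("FontFile", "FontFile2", "FontFile3")


def is_embedded(font: Dict[str, Any]) -> bool:
    """Return True if a parsed font dictionary carries an embedded font program.

    `font` is a hypothetical plain-dict view of a PDF font dictionary.
    """
    subtype = font.get("Subtype")
    # Type 3 fonts define glyphs as content streams and have no separate
    # font program, so this sketch treats them as passing.
    if subtype == "Type3":
        return True
    # Composite (Type 0) fonts keep the descriptor on their descendant fonts.
    if subtype == "Type0":
        descendants = font.get("DescendantFonts", [])
        return bool(descendants) and all(is_embedded(d) for d in descendants)
    descriptor = font.get("FontDescriptor") or {}
    return any(key in descriptor for key in FONT_FILE_KEYS)


def unembedded_fonts(fonts: Iterable[Dict[str, Any]]) -> List[str]:
    """List the BaseFont names of fonts that would fail the embedding rule."""
    return [f.get("BaseFont", "<unnamed>") for f in fonts if not is_embedded(f)]


fonts = [
    # Standard 14 font referenced by name only: no descriptor, not embedded.
    {"Subtype": "Type1", "BaseFont": "Helvetica"},
    # TrueType font whose descriptor carries an embedded font program.
    {
        "Subtype": "TrueType",
        "BaseFont": "ABCDEF+Inter-Regular",
        "FontDescriptor": {"FontFile2": b"..."},
    },
]
print(unembedded_fonts(fonts))  # ['Helvetica']
```

A packer could run a check like this on its input PDF and refuse to label the output `cv-strict` when the returned list is non-empty, leaving the final verdict to the validator.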
### 3.3 Output intent

A `cv-strict` file **SHOULD** include an embedded ICC profile (typically sRGB IEC61966-2.1) referenced from a `/OutputIntent`.

### 3.4 Forbidden constructs

A `.cv` file **MUST NOT** contain:

- PDF JavaScript actions (`/JS` or `/JavaScript`).
- `/Launch` actions.
- `/ImportData` actions.
- `/SubmitForm` actions targeting any URI other than `mailto:`.
- An `/Encrypt` dictionary (encryption is a non-goal in 1.0, see §1.2).
- External stream references (`/F` filespecs pointing to files outside the container).

Validators **MUST** reject files containing any of these.

## 4. Embedded payloads

Payloads are carried as PDF Associated Files via the `/AF` entry on the document catalog (see ISO 32000-2 §14.13).

### 4.1 Filespec dictionary requirements

Each `/Filespec` dictionary representing a `.cv` payload **MUST** set the following entries:

| Entry | Type | Requirement |
| --- | --- | --- |
| `/Type` | name | `/Filespec` |
| `/F` | string | filename in PDFDocEncoding |
| `/UF` | string (UTF-16BE BOM) | filename in Unicode |
| `/EF` | dict | embedded file dict with `/F` pointing to the stream |
| `/Desc` | string | human-readable description |
| `/AFRelationship` | name | one of `/Alternative`, `/Data`, `/Supplement` |
| `/Subtype` | name | MIME type as a name (e.g. `/text#2Fmarkdown`) |

The embedded file stream **MUST** include `/Params << /ModDate (...) /Size N /CheckSum >>` per the PDF spec.

### 4.2 Required content

A `.cv` file **MUST** carry at least one of:

- `resume.md` — `text/markdown; charset=UTF-8`, `/AFRelationship /Alternative`. **RECOMMENDED** as primary payload.
- `resume.html` — `text/html; charset=UTF-8`, `/AFRelationship /Alternative`. The HTML **SHOULD** be self-contained (inline CSS, no external `