openapi: 3.1.0
info:
  title: Docling Serve REST API
  description: |
    HTTP service exposing the Docling document-parsing pipeline. Submit documents
    as URLs or uploads and receive a `DoclingDocument` together with optional
    Markdown, HTML, and text renditions. Synchronous endpoints return the
    converted document inline; the asynchronous endpoints return a task handle
    that can be polled, streamed over WebSocket, and fetched on completion.
  version: '1.0'
  license:
    name: MIT
    url: https://github.com/docling-project/docling-serve/blob/main/LICENSE
  contact:
    name: Docling Project
    url: https://github.com/docling-project/docling-serve
servers:
  - url: http://localhost:5001
    description: Local Docling Serve container.
tags:
  - name: Convert
    description: Synchronous conversion endpoints.
  - name: Async
    description: Asynchronous conversion submission.
  - name: Tasks
    description: Task status, results, and streaming.
  - name: System
    description: Health and metadata.
paths:
  /v1/convert/source:
    post:
      tags:
        - Convert
      summary: Convert Documents From Source URLs
      description: |
        Synchronously convert one or more documents pulled from HTTP source URLs
        or provided inline as base64. Returns the converted document(s) directly
        in the response body.
      operationId: convertSource
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ConvertSourceRequest'
      responses:
        '200':
          description: Conversion completed.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ConvertResponse'
            application/zip:
              schema:
                type: string
                format: binary
        '400':
          $ref: '#/components/responses/Error'
        '500':
          $ref: '#/components/responses/Error'
  /v1/convert/file:
    post:
      tags:
        - Convert
      summary: Convert Documents From Uploaded Files
      description: |
        Synchronously convert one or more documents uploaded as multipart form
        data. Conversion options are supplied as additional form fields.
      operationId: convertFile
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              $ref: '#/components/schemas/ConvertFileForm'
      responses:
        '200':
          description: Conversion completed.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ConvertResponse'
            application/zip:
              schema:
                type: string
                format: binary
        '400':
          $ref: '#/components/responses/Error'
        '500':
          $ref: '#/components/responses/Error'
  /v1/convert/source/async:
    post:
      tags:
        - Async
      summary: Submit Source Conversion Asynchronously
      description: |
        Submit a source-based conversion job to the async queue. Returns a
        `TaskDetail` with the queue position and a `task_id` for subsequent
        polling.
      operationId: convertSourceAsync
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ConvertSourceRequest'
      responses:
        '200':
          description: Job accepted.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/TaskDetail'
        '400':
          $ref: '#/components/responses/Error'
  /v1/convert/file/async:
    post:
      tags:
        - Async
      summary: Submit File Conversion Asynchronously
      description: |
        Submit an upload-based conversion job to the async queue. Returns a
        `TaskDetail` for polling.
      operationId: convertFileAsync
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              $ref: '#/components/schemas/ConvertFileForm'
      responses:
        '200':
          description: Job accepted.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/TaskDetail'
        '400':
          $ref: '#/components/responses/Error'
  /v1/status/poll/{task_id}:
    get:
      tags:
        - Tasks
      summary: Poll Asynchronous Task Status
      description: Return the current `TaskDetail` for the task identified by `task_id`.
      operationId: pollTaskStatus
      parameters:
        - name: task_id
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Current task status.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/TaskDetail'
        '404':
          $ref: '#/components/responses/Error'
  /v1/result/{task_id}:
    get:
      tags:
        - Tasks
      summary: Get Asynchronous Task Result
      description: Return the conversion result for a completed asynchronous task.
      operationId: getTaskResult
      parameters:
        - name: task_id
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Conversion result.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ConvertResponse'
            application/zip:
              schema:
                type: string
                format: binary
        '404':
          $ref: '#/components/responses/Error'
        '409':
          description: Task is not yet complete.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
  /health:
    get:
      tags:
        - System
      summary: Service Health Check
      description: Liveness/readiness probe for Docling Serve.
      operationId: getHealth
      responses:
        '200':
          description: Service is healthy.
          content:
            application/json:
              schema:
                type: object
                properties:
                  status:
                    type: string
                    example: ok
  /openapi.json:
    get:
      tags:
        - System
      summary: Get OpenAPI Specification
      description: Returns the live OpenAPI specification for the running Docling Serve instance.
      operationId: getOpenApiSpec
      responses:
        '200':
          description: OpenAPI document.
          content:
            application/json:
              schema:
                type: object
components:
  responses:
    Error:
      description: Error response.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
  schemas:
    ConvertSourceRequest:
      type: object
      description: Request body for source-based document conversion.
      properties:
        http_sources:
          type: array
          description: HTTP/HTTPS URLs to fetch and convert.
          items:
            $ref: '#/components/schemas/HttpSource'
        file_sources:
          type: array
          description: Inline base64-encoded documents.
          items:
            $ref: '#/components/schemas/FileSource'
        options:
          $ref: '#/components/schemas/ConvertDocumentsOptions'
        target:
          $ref: '#/components/schemas/Target'
    HttpSource:
      type: object
      required:
        - url
      properties:
        url:
          type: string
          format: uri
        headers:
          type: object
          additionalProperties:
            type: string
    FileSource:
      type: object
      required:
        - base64_string
        - filename
      properties:
        base64_string:
          type: string
          description: Base64-encoded file content.
        filename:
          type: string
          description: Original filename, used for format detection.
    ConvertFileForm:
      type: object
      properties:
        files:
          type: array
          items:
            type: string
            format: binary
        from_formats:
          type: array
          items:
            type: string
        to_formats:
          type: array
          items:
            type: string
        image_export_mode:
          type: string
          enum: [embedded, placeholder, referenced]
        do_ocr:
          type: boolean
        ocr_engine:
          type: string
        force_ocr:
          type: boolean
        ocr_lang:
          type: array
          items:
            type: string
        pdf_backend:
          type: string
        table_mode:
          type: string
          enum: [fast, accurate]
        do_table_structure:
          type: boolean
        include_images:
          type: boolean
        images_scale:
          type: number
        return_as_file:
          type: boolean
    ConvertDocumentsOptions:
      type: object
      description: Conversion behavior knobs shared by sync and async endpoints.
      properties:
        from_formats:
          type: array
          description: Input formats to accept.
          items:
            type: string
            enum: [pdf, docx, pptx, xlsx, html, md, asciidoc, image, audio, csv, xml_uspto, xml_jats]
        to_formats:
          type: array
          description: Output formats to produce.
          items:
            type: string
            enum: [md, html, json, text, doctags]
        image_export_mode:
          type: string
          enum: [embedded, placeholder, referenced]
        do_ocr:
          type: boolean
        force_ocr:
          type: boolean
        ocr_engine:
          type: string
          enum: [easyocr, tesseract, tesseract_cli, rapidocr, mac_ocr, ocrmac]
        ocr_lang:
          type: array
          items:
            type: string
        pdf_backend:
          type: string
          enum: [dlparse_v1, dlparse_v2, pypdfium2]
        table_mode:
          type: string
          enum: [fast, accurate]
        do_table_structure:
          type: boolean
        do_code_enrichment:
          type: boolean
        do_formula_enrichment:
          type: boolean
        do_picture_classification:
          type: boolean
        do_picture_description:
          type: boolean
        picture_description_area_threshold:
          type: number
        include_images:
          type: boolean
        images_scale:
          type: number
        pipeline:
          type: string
          enum: [standard, vlm]
        vlm_model:
          type: string
        return_as_file:
          type: boolean
        abort_on_error:
          type: boolean
    Target:
      type: object
      description: Optional delivery target for the converted output.
      properties:
        kind:
          type: string
          enum: [inbody, zip, s3, http]
        zip_file_name:
          type: string
    ConvertResponse:
      type: object
      properties:
        document:
          $ref: '#/components/schemas/DoclingDocumentRendering'
        status:
          type: string
          enum: [success, partial_success, failure]
        errors:
          type: array
          items:
            type: object
        processing_time:
          type: number
        timings:
          type: object
          additionalProperties:
            type: number
    DoclingDocumentRendering:
      type: object
      description: Container of one or more renderings of the converted document.
      properties:
        filename:
          type: string
        md_content:
          type: string
        html_content:
          type: string
        json_content:
          $ref: '#/components/schemas/DoclingDocument'
        text_content:
          type: string
        doctags_content:
          type: string
    DoclingDocument:
      type: object
      description: Canonical Docling document representation. See docling-core for the full schema.
      properties:
        schema_name:
          type: string
        version:
          type: string
        name:
          type: string
        origin:
          type: object
        furniture:
          type: array
          items:
            type: object
        body:
          type: object
        groups:
          type: array
          items:
            type: object
        texts:
          type: array
          items:
            type: object
        tables:
          type: array
          items:
            type: object
        pictures:
          type: array
          items:
            type: object
        key_value_items:
          type: array
          items:
            type: object
        pages:
          type: object
    TaskDetail:
      type: object
      description: Async task descriptor returned by submit/poll/status endpoints.
      required:
        - task_id
        - task_status
      properties:
        task_id:
          type: string
        task_status:
          type: string
          enum: [pending, started, success, failure, revoked]
        task_position:
          type: integer
          description: Position in the queue (0 = currently running).
        task_meta:
          type: object
          additionalProperties: true
        created_at:
          type: string
          format: date-time
        started_at:
          type: string
          format: date-time
        finished_at:
          type: string
          format: date-time
    ErrorResponse:
      type: object
      properties:
        detail:
          type: string
        code:
          type: string
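# ---------------------------------------------------------------------------
# Illustrative example (not part of the specification).
#
# A minimal sketch of a ConvertSourceRequest body for POST /v1/convert/source,
# written as YAML for readability; on the wire it is sent as application/json.
# The URL and the option values are placeholders chosen for illustration.
# Which fields the server actually requires is an assumption: the
# ConvertSourceRequest schema marks none of its top-level properties as
# required.
#
#   http_sources:
#     - url: https://example.com/sample.pdf   # hypothetical document URL
#   options:
#     from_formats: [pdf]
#     to_formats: [md, json]
#     do_ocr: true
#     table_mode: accurate
#   target:
#     kind: inbody
#
# For the asynchronous variant, the same body would go to
# POST /v1/convert/source/async. The `task_id` of the returned TaskDetail is
# then used with GET /v1/status/poll/{task_id} and, once the task has
# completed, with GET /v1/result/{task_id} (which answers 409 while the task
# is still running).
# ---------------------------------------------------------------------------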