openapi: 3.0.3 info: version: '1.0' title: Cambrion API description: | The official Cambrion API specification. To receive a free API key, reach out at hello@cambrion.ai with a brief description of your use case. servers: - url: http://localhost:8080 - url: https://api.cambrion.io/v1 tags: - name: Executions description: Execution environment that stores results. externalDocs: description: Find out more url: https://docs.cambrion.io/docs/workflows/executions - name: Pipelines description: Machine learning pipelines. externalDocs: description: Find out more in the documentation url: https://docs.cambrion.io/docs/workflows/pipeline - name: Extractions description: Extraction definitions. externalDocs: description: Find out more in the documentation url: https://docs.cambrion.io/docs - name: Indices description: Indices externalDocs: description: Find out more in the documentation url: https://docs.cambrion.io/docs/workflows/linker - name: Hooks description: Webhook and notification hooks for execution events. externalDocs: description: Find out more in the documentation url: https://docs.cambrion.io/docs/webhooks paths: /executions: post: summary: Creates an execution description: | Create an execution from an optional ID. If an execution ID is given, that ID is used; otherwise a new one is created. If the execution already exists, the request is ignored and 204 is returned. If a new execution was created, 200 is returned with the execution ID as body. operationId: executions/create-execution tags: - Executions requestBody: $ref: '#/components/requestBodies/ExecutionRequest' responses: '200': $ref: '#/components/responses/ExecutionResponse' '204': description: Execution already exists '400': description: Invalid Execution ID.
content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' get: summary: Gets all executions tags: - Executions operationId: executions/get-all-executions parameters: - $ref: '#/components/parameters/ExecutionTag' responses: '200': $ref: '#/components/responses/ExecutionsResponse' '400': description: Invalid Execution ID. content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' /executions/{executionId}: get: summary: Gets execution tags: - Executions operationId: executions/get-execution parameters: - $ref: '#/components/parameters/ExecutionId' responses: '200': $ref: '#/components/responses/ExecutionResponse' '400': description: Invalid Execution ID. content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' delete: summary: Deletes an execution tags: - Executions operationId: executions/delete-execution parameters: - $ref: '#/components/parameters/ExecutionId' responses: '204': description: Deleted '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /executions/{executionId}/observation/media: post: summary: Add media to an observation tags: - Executions operationId: executions/add-media parameters: - $ref: '#/components/parameters/ExecutionId' requestBody: content: image/jpeg: schema: type: string format: base64 image/png: schema: type: string format: base64 application/pdf: schema: type: string format: base64 responses: '200': $ref: '#/components/responses/MediaIdResponse' '400': description: Bad request. 
content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /executions/{executionId}/observation/media/{mediaId}: get: summary: Retrieve a specific media item operationId: executions/get-media tags: - Executions parameters: - $ref: '#/components/parameters/ExecutionId' - $ref: '#/components/parameters/MediaId' responses: '200': $ref: '#/components/responses/MediaResponse' /executions/{executionId}/observation: post: summary: Merge a raw observation into the current observation description: The raw observation is merged into the current observation context. operationId: executions/create-observation parameters: - $ref: '#/components/parameters/ExecutionId' tags: - Executions requestBody: $ref: '#/components/requestBodies/Observation' responses: '204': description: Observation updated '400': description: Invalid observation content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' get: summary: Get observation description: Get a full observation of the execution. operationId: executions/get-observation parameters: - $ref: '#/components/parameters/ExecutionId' tags: - Executions responses: '200': $ref: '#/components/responses/ObservationResponse' '400': description: Bad request content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /executions/{executionId}/observation/transform: post: summary: Transform an observation description: Transform a raw observation into an object using a JSONata statement. JSONata is a transformation language for JSON data.
For more information see http://docs.jsonata.org/overview.html operationId: executions/transform-observation parameters: - $ref: '#/components/parameters/ExecutionId' tags: - Executions requestBody: content: text/plain: schema: type: string responses: '200': $ref: '#/components/responses/TransformerResponse' '400': description: Bad transform object. content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /executions/{executionId}/observation/json: get: summary: Transform an observation into JSON description: | Transform a raw observation into the corresponding JSON object. The values in the JSON object correspond to the data values in the observation. If data values are not available, the raw text is used. operationId: executions/transform-observation-json parameters: - $ref: '#/components/parameters/ExecutionId' tags: - Executions responses: '200': $ref: '#/components/responses/JsonResponse' '400': description: Bad transform object. content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /executions/{executionId}/observation/refinements: post: summary: Submit refinements to an observation description: | Submit user corrections to extracted data. Refinements can target entities, key-value pairs, tables, or key-value sets by their IDs. Refinements are additive - original data is preserved. 
operationId: executions/submit-refinements tags: - Executions parameters: - $ref: '#/components/parameters/ExecutionId' requestBody: content: application/json: schema: $ref: '#/components/schemas/RefinementRequest' responses: '204': description: Refinements applied successfully '400': description: Invalid refinement request content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' get: summary: Get all refinements for an observation description: Retrieve a summary of all refinements that have been applied to the observation. operationId: executions/get-refinements tags: - Executions parameters: - $ref: '#/components/parameters/ExecutionId' responses: '200': description: Refinements retrieved successfully content: application/json: schema: $ref: '#/components/schemas/RefinementSummary' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /executions/{executionId}/link: post: summary: Link results of an execution description: Link contents of observation to documents in an index. operationId: executions/link-execution parameters: - $ref: '#/components/parameters/ExecutionId' tags: - Executions requestBody: $ref: '#/components/requestBodies/LinkerRequest' responses: '200': $ref: '#/components/responses/LinkerResponse' '400': description: Bad linker parameters content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /executions/{executionId}/submit: post: summary: Submit an execution description: | Triggers all SUBMIT-type hooks attached to this execution. This endpoint is used to signal that an execution is ready for external processing or notification. Returns the exact payloads that were sent (or would be sent) to each hook endpoint. 
operationId: executions/submit-execution tags: - Executions parameters: - $ref: '#/components/parameters/ExecutionId' responses: '200': description: Submit hooks triggered successfully content: application/json: schema: $ref: '#/components/schemas/SubmitResponse' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /executions/{executionId}/retry: post: summary: Retry an execution description: | Retry a failed or completed execution with a specified pipeline. This updates the existing execution, resetting its status to PENDING and scheduling it for async processing. Only executions with status ERROR or COMPLETED can be retried. operationId: executions/retry-execution tags: - Executions parameters: - $ref: '#/components/parameters/ExecutionId' requestBody: $ref: '#/components/requestBodies/RetryExecutionRequest' responses: '200': $ref: '#/components/responses/RetryExecutionResponse' '400': description: Invalid request or execution status does not allow retry content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /hooks: post: summary: Create a hook description: | Create a hook that will be triggered when specific events occur on executions. Hooks can notify external services via webhooks when execution status changes, observations are updated, or when submit events are triggered. operationId: hooks/create-hook tags: - Hooks requestBody: $ref: '#/components/requestBodies/HookRequest' responses: '200': $ref: '#/components/responses/HookResponse' '400': description: Invalid hook configuration content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' get: summary: Get all hooks description: Retrieve all configured hooks. 
operationId: hooks/get-hooks tags: - Hooks responses: '200': $ref: '#/components/responses/HooksResponse' '401': $ref: '#/components/responses/Unauthorized' /hooks/{hookId}: get: summary: Get a specific hook description: Retrieve a specific hook by its ID. operationId: hooks/get-hook tags: - Hooks parameters: - $ref: '#/components/parameters/HookId' responses: '200': $ref: '#/components/responses/HookResponse' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' put: summary: Update a hook description: Update an existing hook configuration. operationId: hooks/update-hook tags: - Hooks parameters: - $ref: '#/components/parameters/HookId' requestBody: $ref: '#/components/requestBodies/HookRequest' responses: '200': $ref: '#/components/responses/HookResponse' '400': description: Invalid hook configuration content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' delete: summary: Delete a hook description: Delete a hook. operationId: hooks/delete-hook tags: - Hooks parameters: - $ref: '#/components/parameters/HookId' responses: '204': description: Hook deleted successfully '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /pipelines: get: summary: Get all deployed pipelines operationId: pipelines/get-deployed-pipelines tags: - Pipelines responses: '200': description: Get list of all available pipelines content: application/json: schema: type: array items: $ref: '#/components/schemas/Pipeline' '401': $ref: '#/components/responses/Unauthorized' post: summary: Create new pipeline operationId: pipelines/create-pipeline tags: - Pipelines requestBody: $ref: '#/components/requestBodies/PipelineRequest' responses: '200': description: Pipeline created content: application/json: schema: type: object properties: pipelineId: type: string '400': description: Invalid pipeline. 
content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /pipelines/{pipelineId}: get: summary: Get a specific pipeline tags: - Pipelines operationId: pipelines/get-pipeline parameters: - $ref: '#/components/parameters/PipelineId' responses: '200': $ref: '#/components/responses/PipelineResponse' '400': description: Bad request. content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' put: summary: Update an existing pipeline tags: - Pipelines operationId: pipelines/update-pipeline parameters: - $ref: '#/components/parameters/PipelineId' requestBody: $ref: '#/components/requestBodies/PipelineRequest' responses: '200': $ref: '#/components/responses/PipelineResponse' '400': description: Invalid pipeline. content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' delete: summary: Delete a specific pipeline tags: - Pipelines operationId: pipelines/delete-pipeline parameters: - $ref: '#/components/parameters/PipelineId' responses: '204': description: Deleted '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /pipelines/{pipelineId}/definition: get: summary: Get graph representation (definition) of a pipeline tags: - Pipelines operationId: pipelines/get-pipeline-definition parameters: - $ref: '#/components/parameters/PipelineId' responses: '200': $ref: '#/components/responses/PipelineDefinitionResponse' '400': description: Bad request. 
content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /pipelines/{pipelineId}/executeSync: post: summary: Execute pipeline synchronously description: | Execute a pipeline synchronously and return the corresponding observation. The timeout is 30 seconds. If the computation takes longer than the timeout period, a timeout error is returned. operationId: pipelines/execute-pipeline-sync tags: - Pipelines parameters: - $ref: '#/components/parameters/PipelineId' requestBody: $ref: '#/components/requestBodies/PipelineExecutionRequest' responses: '200': $ref: '#/components/responses/PipelineSyncResponse' '400': description: Invalid execution request (e.g. execution not found). content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /pipelines/{pipelineId}/executeSync/transform: post: summary: Execute pipeline synchronously and transform the observation description: | Execute a pipeline synchronously and return the transformed observation. operationId: pipelines/execute-pipeline-sync-transform parameters: - $ref: '#/components/parameters/PipelineId' tags: - Pipelines requestBody: $ref: '#/components/requestBodies/PipelineExecutionRequest' responses: '200': $ref: '#/components/responses/TransformerResponse' '400': description: Bad transform object. content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /pipelines/{pipelineId}/executeSync/json: post: summary: Execute pipeline synchronously and return JSON description: | Execute a pipeline synchronously and return the corresponding JSON object.
operationId: pipelines/execute-pipeline-sync-transform-json parameters: - $ref: '#/components/parameters/PipelineId' tags: - Pipelines requestBody: $ref: '#/components/requestBodies/PipelineExecutionRequest' responses: '200': $ref: '#/components/responses/JsonResponse' '400': description: Bad transform object. content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /pipelines/{pipelineId}/executeAsync: post: summary: Execute pipeline asynchronously tags: - Pipelines operationId: pipelines/execute-pipeline-async parameters: - $ref: '#/components/parameters/PipelineId' requestBody: $ref: '#/components/requestBodies/PipelineExecutionRequest' responses: '200': $ref: '#/components/responses/PipelineAsyncResponse' /models: get: summary: Get all registered models tags: - Models operationId: models/get-models responses: '200': description: Get list of all registered models content: application/json: schema: type: array items: type: object properties: name: type: string example: receipt-pipeline state: type: string description: $ref: '#/components/schemas/ModelDescription' config: type: object additionalProperties: true post: summary: Register an uploaded model (currently internal only) tags: - Models operationId: models/register-models requestBody: $ref: '#/components/requestBodies/ModelRequest' responses: '200': description: Deployment successful '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /models/{modelName}/copy: post: summary: Copy a registered model tags: - Models operationId: models/copy-model parameters: - $ref: '#/components/parameters/ModelName' requestBody: $ref: '#/components/requestBodies/ModelCopyRequest' responses: '200': description: Deployment successful '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /indices: get: summary: Get all indices 
description: Get a list of all indices. operationId: indices/get-indices tags: - Indices parameters: - $ref: '#/components/parameters/IndexLimit' - $ref: '#/components/parameters/IndexOffset' responses: '200': description: A list of all indices content: application/json: schema: type: array items: $ref: '#/components/schemas/Index' post: summary: Create index description: Create a new index with an optional schema. operationId: indices/create-index tags: - Indices requestBody: $ref: '#/components/requestBodies/IndexRequest' responses: '204': description: Index created '400': description: Invalid request content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' /indices/{indexId}: get: summary: Get index description description: | Get the description of an index, including the data model if present. operationId: indices/get-index tags: - Indices parameters: - $ref: '#/components/parameters/IndexId' responses: '200': $ref: '#/components/responses/IndexResponse' '400': description: Invalid index ID content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' delete: summary: Delete index description: Delete an index. operationId: indices/delete-index tags: - Indices parameters: - $ref: '#/components/parameters/IndexId' responses: '204': description: Index deleted.
'400': description: Invalid index ID content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /indices/{indexId}/query: post: summary: Query an index with a search string description: | Query an index with a search string operationId: indices/query-index tags: - Indices parameters: - $ref: '#/components/parameters/IndexId' requestBody: $ref: '#/components/requestBodies/QueryRequest' responses: '200': $ref: '#/components/responses/QueryResponse' '400': description: Invalid index ID content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /indices/{indexId}/documents: get: summary: Get all documents tags: - Indices operationId: indices/get-documents parameters: - $ref: '#/components/parameters/IndexId' - $ref: '#/components/parameters/DocumentLimit' - $ref: '#/components/parameters/DocumentOffset' responses: '200': description: A list of all documents in the index content: application/json: schema: type: array items: $ref: '#/components/schemas/IndexedDocument' post: summary: Create document description: | Create a JSON document in an index. If the index does not exist it will be created automatically. operationId: indices/create-documents tags: - Indices parameters: - $ref: '#/components/parameters/IndexId' requestBody: $ref: '#/components/requestBodies/DocumentRequest' responses: '204': description: Document stored successfully '400': description: Invalid index ID content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /indices/{indexId}/documents/batches: post: summary: Create documents in a batch description: | Create multiple JSON documents in an index. If the index does not exist it will be created automatically.
operationId: indices/create-documents-batches tags: - Indices parameters: - $ref: '#/components/parameters/IndexId' requestBody: $ref: '#/components/requestBodies/DocumentBatchRequest' responses: '204': description: Document stored successfully '400': description: Invalid index ID content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /indices/{indexId}/documents/{documentId}: get: summary: Get document tags: - Indices operationId: indices/get-document parameters: - $ref: '#/components/parameters/IndexId' - $ref: '#/components/parameters/DocumentId' responses: '200': $ref: '#/components/responses/DocumentResponse' '400': description: Invalid index ID or document ID content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' delete: summary: Delete document tags: - Indices operationId: indices/delete-document parameters: - $ref: '#/components/parameters/IndexId' - $ref: '#/components/parameters/DocumentId' responses: '204': description: Document deleted '400': description: Invalid index ID or document ID content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /extractions: get: summary: Get all extractions tags: - Extractions operationId: extractions/get-extractions responses: '200': description: A list of all extractions content: application/json: schema: type: array items: $ref: '#/components/schemas/ExtractionItem' post: summary: Create extraction description: | Create a document extraction. This automatically creates a pipeline that corresponds to the instructions in the extraction. 
operationId: extractions/create-extraction tags: - Extractions requestBody: $ref: '#/components/requestBodies/ExtractionRequest' responses: '200': $ref: '#/components/responses/ExtractionResponse' '400': description: Invalid extraction content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /extractions/{extractionId}: get: summary: Get specific extraction tags: - Extractions parameters: - $ref: '#/components/parameters/ExtractionId' operationId: extractions/get-extraction responses: '200': $ref: '#/components/responses/ExtractionItemResponse' '400': description: Invalid extraction ID or extraction body content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' put: summary: Update extraction description: | Update an existing extraction and the corresponding pipeline. operationId: extractions/update-extraction tags: - Extractions parameters: - $ref: '#/components/parameters/ExtractionId' requestBody: $ref: '#/components/requestBodies/ExtractionRequest' responses: '200': $ref: '#/components/responses/ExtractionResponse' '400': description: Invalid extraction ID or extraction body content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' delete: summary: Delete extraction tags: - Extractions operationId: extractions/delete-extraction parameters: - $ref: '#/components/parameters/ExtractionId' responses: '204': description: Extraction deleted '400': description: Invalid extraction ID content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /extractions/{extractionId}/suggest: post: summary: Suggest changes to an extraction description: 
| Suggest changes to an extraction tags: - Extractions parameters: - $ref: '#/components/parameters/ExtractionId' requestBody: $ref: '#/components/requestBodies/SuggestionRequest' responses: '200': $ref: '#/components/responses/ExtractionItemResponse' '400': description: Invalid extraction ID content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /extractions/{extractionId}/suggest/async: post: summary: Suggest changes to an extraction asynchronously description: | Suggest changes to an extraction asynchronously tags: - Extractions parameters: - $ref: '#/components/parameters/ExtractionId' requestBody: $ref: '#/components/requestBodies/SuggestionRequest' responses: '200': $ref: '#/components/responses/SuggestAsyncResponse' '400': description: Invalid extraction ID content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' /extractions/{extractionId}/improve: post: summary: Improve an extraction with an example description: | Improve an extraction with an example tags: - Extractions parameters: - $ref: '#/components/parameters/ExtractionId' requestBody: $ref: '#/components/requestBodies/FeedbackRequest' responses: '204': description: Example stored successfully '400': description: Invalid extraction ID content: application/json: schema: $ref: '#/components/schemas/Error' '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' components: requestBodies: PipelineExecutionRequest: description: Execution request for a pipeline content: application/json: schema: $ref: '#/components/schemas/PipelineExecutionObject' RetryExecutionRequest: description: Request to retry an execution with a specified pipeline content: application/json: schema: $ref: '#/components/schemas/RetryExecutionRequestBody' ExecutionRequest: 
description: Execution is the context which holds data related to a specific execution content: application/json: schema: $ref: '#/components/schemas/Execution' SuggestionRequest: description: Suggest a change to an existing extraction via natural language or examples content: application/json: schema: $ref: '#/components/schemas/Suggestion' FeedbackRequest: description: Add feedback via examples content: application/json: schema: $ref: '#/components/schemas/Feedback' ModelRequest: description: Model request content: application/json: schema: type: object properties: name: type: string example: receipt-pipeline description: $ref: '#/components/schemas/ModelDescription' config: type: object additionalProperties: true ModelCopyRequest: description: Model copy request content: application/json: schema: type: object properties: newName: type: string example: receipt-pipeline newDescription: $ref: '#/components/schemas/ModelDescription' newConfig: type: object additionalProperties: true PipelineRequest: description: Pipeline request content: application/json: schema: type: object properties: pipelineId: type: string example: receipt-pipeline name: type: string example: receipt-pipeline deploy: type: boolean default: true description: Whether to deploy the pipeline when creating/updating it description: type: string example: A pipeline to extract contents from a receipt tag: type: string version: type: integer example: 1 pipelineDefinition: $ref: '#/components/schemas/PipelineDefinition' Observation: description: Observation request content: application/json: schema: $ref: '#/components/schemas/Observation' MediaRequest: description: Media request content: image/png: schema: type: string format: binary application/pdf: schema: type: string format: binary text/plain: schema: type: string LinkerRequest: description: Linker request content: application/json: schema: $ref: '#/components/schemas/EntityLinkerConfig' IndexRequest: description: Index creation request
content: application/json: schema: $ref: '#/components/schemas/Index' ExtractionRequest: description: Extraction request content: application/json: schema: type: object properties: description: type: string description: The description of the extraction compact: type: boolean default: false description: Faster response but no confidences highPrecision: type: boolean default: false description: Used for higher precision but slower response size: type: number default: 1300 description: Image resolution in px. Higher leads to better precision but slower response. parallelProcessing: type: boolean default: false description: | Pages will be processed in parallel. This leads to lower latency but context between pages will be lost. intelligentBatching: type: boolean default: false description: | The AI will batch as many pages together as possible. This allows understanding content across pages. Will be ignored if parallelProcessing is true. targetModel: type: string generationInstruct: type: object description: The instruct JSON expression that is used to describe the extraction. additionalProperties: true hookIds: type: array items: type: string description: List of hook IDs to attach to executions created from this extraction. 
DocumentRequest: description: Document request content: application/json: schema: $ref: '#/components/schemas/IndexedDocument' DocumentBatchRequest: description: Document batch request content: application/json: schema: type: array items: $ref: '#/components/schemas/IndexedDocument' QueryRequest: description: Query request content: application/json: schema: $ref: '#/components/schemas/Query' HookRequest: description: Hook creation or update request content: application/json: schema: $ref: '#/components/schemas/Hook' responses: ExecutionResponse: description: Response containing an execution content: application/json: schema: $ref: '#/components/schemas/Execution' ExecutionsResponse: description: Response containing a list of executions content: application/json: schema: type: array items: $ref: '#/components/schemas/Execution' RetryExecutionResponse: description: Response containing the execution ID of the retried execution content: application/json: schema: type: object properties: executionId: type: string description: ID of the execution that was retried ExtractionResponse: description: Response containing the pipeline ID of an extraction content: application/json: schema: type: object properties: pipelineId: type: string ExtractionItemResponse: description: Response containing an extraction item content: application/json: schema: $ref: '#/components/schemas/ExtractionItem' PipelineResponse: description: Response containing a pipeline and its definition content: application/json: schema: type: object properties: pipeline: $ref: '#/components/schemas/Pipeline' pipelineDefinition: $ref: '#/components/schemas/PipelineDefinition' PipelineStatusResponse: description: Status response of pipeline content: application/json: schema: $ref: '#/components/schemas/Pipeline' PipelineDefinitionResponse: description: Pipeline definition content: application/json: schema: $ref: '#/components/schemas/PipelineDefinition' PipelineSyncResponse: description: Response of synchronous pipeline execution content: application/json: schema: type:
object properties: executionId: type: string observation: $ref: "#/components/schemas/Observation" PipelineDeploymentResponse: description: Response of pipeline deployment content: application/json: schema: type: object properties: pipelineDeploymentId: type: string PipelineAsyncResponse: description: Response of asynchronous pipeline execution content: application/json: schema: type: object properties: executionId: type: string SuggestAsyncResponse: description: Response of asynchronous suggestion request content: application/json: schema: type: object properties: executionId: type: string ModelResponse: description: Response to a model request content: application/json: schema: type: object properties: modelName: type: string MediaResponse: description: Specific media content: image/*: schema: type: string format: binary application/pdf: schema: type: string format: binary text/plain: schema: type: string MediaIdResponse: description: Update media ID content: application/json: schema: type: object properties: mediaId: type: string ObservationResponse: description: Observation response content: application/json: schema: $ref: "#/components/schemas/Observation" LinkerResponse: description: Linker response content: application/json: schema: type: array items: $ref: '#/components/schemas/LinkedDocument' DocumentResponse: description: Document response content: application/json: schema: $ref: '#/components/schemas/IndexedDocument' TransformerResponse: description: | Response after transformation. The output can be an arbitrary object created by a JSONata expression. It is returned as a string and must be deserialized by the client. content: text/plain: schema: type: string JsonResponse: description: | Response after the observation was transformed to the corresponding JSON object.
content: application/json: schema: additionalProperties: true IndexResponse: description: Response containing an index content: application/json: schema: $ref: '#/components/schemas/Index' QueryResponse: description: List of documents that matched the query content: application/json: schema: type: array items: $ref: '#/components/schemas/IndexedDocument' HookResponse: description: Response containing a single hook content: application/json: schema: $ref: '#/components/schemas/Hook' HooksResponse: description: Response containing a list of hooks content: application/json: schema: type: array items: $ref: '#/components/schemas/Hook' NotFound: description: The specified resource was not found content: application/json: schema: $ref: '#/components/schemas/Error' Unauthorized: description: Unauthorized content: application/json: schema: $ref: '#/components/schemas/Error' PipelineAlreadyDeployed: description: The pipeline to be deployed was already deployed content: application/json: schema: $ref: '#/components/schemas/Pipeline' parameters: PipelineId: name: pipelineId in: path description: ID of the pipeline to execute required: true schema: type: string ModelId: name: modelId in: path description: ID of the model required: true schema: type: string ModelName: name: modelName in: path description: Name of the model required: true schema: type: string DeploymentId: name: deploymentId in: path description: ID of the pipeline deployment required: true schema: type: string ExecutionId: name: executionId in: path description: ID of an execution required: true schema: type: string MediaId: name: mediaId in: path description: ID of uploaded media required: true schema: type: string ExtractionId: name: extractionId in: path description: ID of an extraction required: true schema: type: string HookId: name: hookId in: path description: ID of a hook required: true schema: type: string IndexId: name: indexId in: path description: ID of the index required: true schema: type:
string example: Warehouse-Index DocumentId: name: documentId in: path description: ID of a document required: true schema: type: string ExecutionTag: name: tag in: query description: Filter executions by tag schema: type: string IndexLimit: name: limit in: query description: Limits the number of indices on a page schema: type: integer IndexOffset: name: offset in: query description: Specifies the page number of the indices to be displayed schema: type: integer DocumentLimit: name: limit in: query description: Limits the number of documents on a page schema: type: integer DocumentOffset: name: offset in: query description: Specifies the page number of the documents to be displayed schema: type: integer schemas: ModelDescription: type: object properties: name: type: string description: type: string version: type: integer resourceTag: type: string RetryExecutionRequestBody: type: object required: - pipelineId properties: pipelineId: type: string description: ID of the pipeline to execute for the retry ExtractionItem: properties: extractionId: type: string description: type: string state: type: string description: The state of the extraction. It can take up to 20 seconds until the extraction is ready. compact: type: boolean default: false description: Faster response but no confidences highPrecision: type: boolean default: false description: Used for higher precision but slower response size: type: number default: 1300 description: Image resolution in px. Higher leads to better precision but slower response. parallelProcessing: type: boolean default: false description: | Pages will be processed in parallel. This leads to lower latency but context between pages will be lost. intelligentBatching: type: boolean default: false description: | The AI will batch as many pages together as possible. This allows understanding content across pages. Will be ignored if parallelProcessing is true. 
targetModel: type: string generationInstruct: type: object additionalProperties: true hookIds: type: array items: type: string description: | List of hook IDs that will be associated with executions using this extraction. These hooks will fire on status changes, observation updates, and submit events. GenerationInstruct: type: object additionalProperties: true InstructTree: type: object additionalProperties: true GenerationField: type: object properties: fieldName: type: string fieldKey: type: string description: type: string examples: type: array items: type: string type: type: string default: "STRING" enum: - STRING - INTEGER - BOOLEAN - DATE - FLOAT - ENUM GenerationKeyValueSet: type: object properties: description: type: string set: type: object additionalProperties: "$ref": "#/components/schemas/GenerationField" GenerationTable: type: object properties: key: type: string description: type: string columns: type: object additionalProperties: "$ref": "#/components/schemas/GenerationKeyValueSet" Pipeline: type: object properties: pipelineId: type: string description: ID of the pipeline name: type: string description: Name of the pipeline description: type: string description: Description of the pipeline tag: type: string description: A tag that can be used to group pipelines status: type: string description: Current status of the pipeline version: type: integer description: Version of the pipeline PipelineDefinition: type: object properties: pipelineDefinitionId: type: string example: receipt-pipeline-definition nodes: type: array description: The nodes of the graph that describes the pipeline items: "$ref": "#/components/schemas/PipelineNode" example: - modelId: ocr_recognizer modelName: ocr_recognizer modelVersion: 1 modelParameters: param1: 1 param2: 2 canvas: position: x: 0 y: 250 inputs: - inputName: info_array_ocr_input inputShape: - 1 inputType: STRING outputs: - outputName: info_array_ocr_output outputShape: - 1 outputType: STRING - modelId: static_layout_recognizer
modelName: static_layout_recognizer modelVersion: 1 modelParameters: targetModel: some_model labels: - date - name - amount canvas: position: x: 0 y: 500 inputs: - inputName: info_array_static_layout_input inputShape: - 1 inputType: STRING outputs: - outputName: info_array_static_layout_output outputShape: - 1 outputType: STRING - modelId: entity_parser modelName: entity_parser modelVersion: 1 modelParameters: date: DATE name: STRING amount: NUMBER canvas: position: x: 0 y: 750 inputs: - inputName: info_array_parser_input inputShape: - 1 inputType: STRING outputs: - outputName: info_array_parser_output outputShape: - 1 outputType: STRING - modelId: entity_deduplicator modelName: entity_deduplicator modelVersion: 1 modelParameters: keys: - date - name - amount canvas: position: x: 0 y: 1000 inputs: - inputName: info_array_deduplicator_input inputShape: - 1 inputType: STRING outputs: - outputName: info_array_deduplicator_output outputShape: - 1 outputType: STRING edges: type: array description: The edges of the graph that describes the pipeline items: "$ref": "#/components/schemas/PipelineEdge" example: - id: edge-1 dataHandle: ocr_result source: ocr_recognizer target: static_layout_recognizer sourceHandle: info_array_ocr_output targetHandle: info_array_static_layout_input - id: edge-2 dataHandle: recognizer_result source: static_layout_recognizer target: entity_parser sourceHandle: info_array_static_layout_output targetHandle: info_array_parser_input - id: edge-3 dataHandle: parser_result source: entity_parser target: entity_deduplicator sourceHandle: info_array_parser_output targetHandle: info_array_deduplicator_input PipelineEdge: type: object description: | An edge in the graph describing the pipeline. Multiple edges can originate from a single node, but only a single edge can end at a target. To have multiple edges end at a single node, each edge must be assigned to a corresponding input. properties: id: type: string description: The ID of the edge.
dataHandle: type: string description: | Name of the handle that holds the data transferred from one node to another. This needs to be used when data goes from one node to multiple other nodes. source: type: string description: Model ID of the source node target: type: string description: Model ID of the target node sourceHandle: type: string description: Name of the output variable of the source node. targetHandle: type: string description: Name of the input variable of the target node. PipelineNode: type: object description: | A node of the graph describing the pipeline. The node corresponds to a single execution of a model. The graph can only have a single input node. The identity model (identity_model) can be used to fan out the input. properties: modelId: type: string description: ID of the model modelName: type: string description: Name of the model modelVersion: type: integer default: -1 description: Version of the model canvas: type: object description: Properties used to display the node properties: position: type: object properties: x: type: integer y: type: integer modelParameters: type: object additionalProperties: true description: The parameters that will be provided to the underlying model at inference time inputs: type: array description: Inputs of the model. These depend on the implementation. items: $ref: '#/components/schemas/ModelInput' outputs: type: array description: Outputs of the model. These depend on the implementation.
items: $ref: '#/components/schemas/ModelOutput' ModelInput: type: object properties: inputName: type: string description: Variable name of the input inputShape: type: array items: type: integer inputType: type: string default: "STRING" enum: - STRING - INT_TENSOR - FLOAT_TENSOR ModelOutput: type: object properties: outputName: type: string description: Variable name of the output outputShape: type: array items: type: integer outputType: type: string default: "STRING" enum: - STRING - INT_TENSOR - FLOAT_TENSOR Execution: type: object description: | The execution is a stateful environment in which media (such as images or PDF files) can be stored and used as inputs for pipelines. The ID in the request body is optional (generated if empty) and must be unique. properties: executionId: type: string description: ID of the execution tag: type: string description: Tag to identify the execution createdAt: type: string description: Creation time completedAt: type: string description: Completion time duration: type: number description: Duration in seconds status: type: string description: Status of the current execution. Includes an error message in case of an error. metaData: type: object additionalProperties: true hookIds: type: array items: type: string description: | Optional list of hook IDs to trigger for this execution. The referenced hooks will be notified when events occur on this execution (status changes, observation updates, etc.). FeedbackExample: type: object description: | Feedback example properties: media: description: | Array of base64-encoded media files. Content type will be detected automatically. PDF, DOCX, and PPTX files will be rendered as images. The images can then be processed within a pipeline.
items: type: string format: base64 type: array example: "$ref": "#/components/schemas/Observation" Feedback: type: object description: | Feedback properties: note: type: string example: "$ref": "#/components/schemas/FeedbackExample" Suggestion: type: object properties: media: description: | Array of base64-encoded media files. The media files will be used to derive a suggestion for a possible underlying extraction. Content type will be detected automatically. PDF, DOCX, and PPTX files will be rendered as images. The images can then be processed within a pipeline. items: type: string format: base64 type: array documentContext: type: string instruction: type: string feedback: type: array items: "$ref": "#/components/schemas/Feedback" PipelineExecutionObject: type: object description: | The execution is a stateful environment in which media (such as images or PDF files) can be stored and used as inputs for pipelines. The ID in the request body is optional (generated if empty) and must be unique. Nothing will be persisted if transient is true. To trigger the pipeline, either an execution ID containing valid media or base64-encoded media under the media property must be provided. properties: executionId: type: string description: ID of the execution tag: type: string description: Tag used to identify the resulting execution. Ignored if transient is true. transient: type: boolean description: Whether to delete all execution data after pipeline completion transform: type: string description: | JSONata expression to transform the resulting observation into a desired object. JSONata is a transformation language for JSON data. For more information see http://docs.jsonata.org/overview.html tryImageConversion: type: boolean description: | DEPRECATED: Tries to convert the provided content to an image (e.g.
PDF) default: false trySimpleText: type: boolean description: | DEPRECATED: Tries to extract readable text from input media (e.g. Word doc). A number of different file formats are supported. Internally Apache Tika is used for text extraction. A full list of supported file formats can be found here: https://tika.apache.org/2.9.1/formats.html default: false idempotent: type: boolean description: Whether to update the existing observation with the results from the pipeline run (always true if executionId is null) default: false media: description: | Array of base64-encoded media files. Content type will be detected automatically. PDF, DOCX, and PPTX files will be rendered as images. The images can then be processed within a pipeline. items: type: string format: base64 type: array runtimeParameters: type: object additionalProperties: true description: Not active yet! observation: "$ref": "#/components/schemas/Observation" text: type: string description: Raw text that can be used as input in pipelines hookIds: type: array items: type: string description: | Optional list of hook IDs to trigger for this pipeline execution. The referenced hooks will be notified when events occur on the resulting execution (status changes, observation updates, etc.). entryPoint: type: string EntityLinkerConfig: properties: group: items: "$ref": "#/components/schemas/MatchGroup" type: array document: type: object additionalProperties: true topk: items: "$ref": "#/components/schemas/TopKIndexFilter" type: array type: object title: Entity Link Request Index: type: object properties: indexId: type: string description: Unique ID of the index. semanticSearchFields: type: array description: A list of document fields whose text is semantically embedded.
items: type: string IndexedDocument: type: object properties: indexId: type: string source: type: object additionalProperties: true score: type: number Query: type: object properties: fullText: type: object additionalProperties: type: string description: Fields used for full text search semanticSearch: description: | A map that maps search strings to fields. type: object additionalProperties: type: string k: type: integer description: The number of results to return that are similar to the search string searchPipelineId: type: string description: | ID of the pipeline to be used for searching. If no pipeline is given, the default pipeline according to the number of search fields is selected. pipelineParameters: "$ref": "#/components/schemas/PipelineExecutionObject" Block: required: - text properties: text: type: string geometry: "$ref": "#/components/schemas/Geometry" type: object title: Block description: The entity block combining recognized text and geometry. BoundingBox: required: - width - height - left - top properties: width: type: number description: The relative width of the entity box. Can be between 0 and 1. height: type: number description: The relative height of the entity box. Can be between 0 and 1. left: type: number description: The relative left-most x-coordinate of the entity box. Can be between 0 and 1. top: type: number description: The relative top-most y-coordinate of the entity box. Can be between 0 and 1. type: object title: Bounding Box description: The bounding box describing the location of the recognized entity. 
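# Illustrative BoundingBox value (hypothetical numbers, not from the official
# documentation). All four properties are relative values between 0 and 1; this
# sketch assumes they are relative to the dimensions of the rendered page, which
# the schema does not state explicitly. A box starting 10% from the left and 20%
# from the top that covers a quarter of the width and 5% of the height would be:
#
#   left: 0.1
#   top: 0.2
#   width: 0.25
#   height: 0.05
#
# Under that assumption, left + width and top + height should not exceed 1.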
Code: required: - id - entity - payload - type properties: id: type: string entity: "$ref": "#/components/schemas/Entity" tag: type: string payload: type: string format: binary type: type: string enum: - UPC_A - UPC_E - EAN_8 - EAN_13 - UPC_EAN_EXTENSION - CODE_39 - CODE_93 - CODE_128 - CODEBAR - ITF - QR_CODE - DATA_MATRIX - AZTEC - PDF_417 - MAXICODE - RSS_14 - RSS_EXPANDED title: Code Type type: object title: 'Code entity' Collection: required: - source properties: source: type: string enum: - ENTITY_TEXT - ENTITY_VALUE - ENTITY_DATA - KEY_VALUE - KEY_DATA - KEY_TEXT - VALUE_TEXT - VALUE_VALUE - VALUE_DATA - TABLE_CELL_ROW_TAG - CODE title: Collection Source type: object title: Collection Document: description: A document in a media file corresponding to a page. properties: id: type: string tables: description: All tables contained in the document items: "$ref": "#/components/schemas/Table" type: array entities: description: All text entities contained in the document items: "$ref": "#/components/schemas/Entity" type: array keyValueSet: "$ref": "#/components/schemas/KeyValueSet" keyValueSets: description: List of key value sets contained in the document items: "$ref": "#/components/schemas/KeyValueSet" type: array type: object title: Document DocumentPage: required: - page - document properties: page: type: integer description: The page number example: 1 document: "$ref": "#/components/schemas/Document" type: object title: 'Document page' Entity: description: | A single entity in a document. An entity is the central carrier of information and holds properties like text and parsed data (such as numbers or dates converted from text) but also meta information such as location and layout type. required: - id - block properties: id: type: string description: A unique ID for the entity block: "$ref": "#/components/schemas/Block" confidence: type: number description: The confidence score of the entity. Typically a score that is the result of the OCR detection. 
label: type: string description: | A label for the entity. When converting the observation into a JSON object, the labels are used as keys. type: type: string description: | The content type set by the entity parser. When parsing entities, not only is the type set, but the text is also parsed according to the entity type. enum: - STRING - NUMBER - QUANTITY - UNIT - DATE - PERCENTAGE - BOOLEAN - ENUMERATION title: Entity Type data: "$ref": "#/components/schemas/EntityData" embedding: items: type: number type: array similarity: "$ref": "#/components/schemas/EntitySimilarity" layoutType: type: string enum: - WORD - LINE title: Entity Layout Type refinement: "$ref": "#/components/schemas/Refinement" description: User-provided correction for this entity's value or location type: object title: Entity EntityData: description: | The entity data holds converted (parsed) values from the entity parser as well as data associated by the entity linker. properties: documentId: type: string description: The ID of the document in an index that the entity linker assigned to the entity. textValue: type: string description: The text value of the entity (equivalent to the text). quantityValue: type: integer description: The parsed integer number from the text. numberValue: type: number description: The parsed decimal number from the text. unitValue: type: string description: The parsed unit (such as kg, m) from the text. dateValue: type: string format: date description: The parsed date in yyyy-MM-dd format. boolValue: type: boolean description: The parsed boolean from the text. textData: type: string description: Text data from the document field that the entity linker associated to the entity. quantityData: type: integer description: Integer type data from the document field that the entity linker associated to the entity. numberData: type: number description: Decimal type data from the document field that the entity linker associated to the entity.
unitData: type: string description: Unit type data from the document field that the entity linker associated to the entity. dateData: type: string format: date description: Date type data from the document field that the entity linker associated to the entity. boolData: type: boolean description: Boolean type data from the document field that the entity linker associated to the entity. field: type: string description: Name of the document field that the entity linker associated to the entity. score: type: number description: The score the entity linker assigned to the association of the document field to the entity. sourceIndex: type: string description: The name of the index the associated document is stored in. type: object title: Entity Data EntityFilter: properties: tag: type: string label: type: string regExp: type: string hasData: type: boolean hasValue: type: boolean layoutType: type: string enum: - WORD - LINE title: Entity Layout Type type: object title: Entity Filter EntitySimilarity: description: A similarity measure between the associated data and parsed value of the entity. required: - type properties: type: type: string enum: - TEXT_SIM - NUMBER_DIFF - BOOL_SAME title: Similarity Type cosineSimilarity: type: number description: The text similarity (Jaccard similarity with k=2) between the text value and text data. amountDiff: type: integer description: The difference between the parsed integer number and the corresponding data field numberDiff: type: number description: The difference between the parsed decimal number and the corresponding data field same: type: boolean description: Exact match.
type: object title: Entity Similarity Geometry: required: - boundingBox properties: polygon: "$ref": "#/components/schemas/Polygon" boundingBox: "$ref": "#/components/schemas/BoundingBox" type: object title: Geometry MediaContent: required: - id - mediaId properties: id: type: string example: media-1 description: The ID of the media content mediaId: type: string example: media-1 description: The ID of the media file (typically equivalent to the ID) documentPages: description: The pages of the document items: "$ref": "#/components/schemas/DocumentPage" type: array mediaHash: type: string description: A hash of the media file codes: description: A list of codes recognized in the media file (e.g. barcodes or datacodes) items: "$ref": "#/components/schemas/Code" type: array metaData: "$ref": "#/components/schemas/ImageMetaData" label: "$ref": "#/components/schemas/Label" rawText: description: The raw text in the media file type: string type: object title: Image Content description: The content of a single media file ImageMetaData: required: - width - height properties: width: type: integer height: type: integer type: object title: Image Meta Data Observation: required: - executionId properties: executionId: type: string example: execution-1 mediaContents: items: "$ref": "#/components/schemas/MediaContent" type: array documents: items: "$ref": "#/components/schemas/LinkedDocument" type: array type: object title: Execution Observation description: The structured content of a set of media files. 
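# Illustrative Observation (hypothetical values, not from the official
# documentation), sketching how the schemas nest:
# Observation -> MediaContent -> DocumentPage -> Document -> Entity -> Block.
# Only the required properties of each schema plus a few optional ones are shown.
#
#   executionId: execution-1
#   mediaContents:
#     - id: media-1
#       mediaId: media-1
#       documentPages:
#         - page: 1
#           document:
#             id: document-1            # hypothetical ID
#             entities:
#               - id: entity-1          # hypothetical ID
#                 block:
#                   text: Total
#                   geometry:
#                     boundingBox: {left: 0.1, top: 0.2, width: 0.25, height: 0.05}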
KeyValuePair: required: - key properties: key: "$ref": "#/components/schemas/Entity" entityValue: "$ref": "#/components/schemas/Entity" keyValueSetValue: "$ref": "#/components/schemas/KeyValueSet" tableValue: "$ref": "#/components/schemas/Table" tag: type: string refinement: "$ref": "#/components/schemas/KeyValuePairRefinement" description: Refinement for this key-value pair association type: object title: Key Value Pair description: | A description of two entities that are logically related. The key typically represents text that describes the value entity. The value entity can also represent nested structures such as other key value sets or tables. KeyValueSet: description: A key value set can be a set of arbitrary key value pairs but also a row in a table. properties: id: type: string tag: type: string pairs: items: "$ref": "#/components/schemas/KeyValuePair" type: array description: An array of key value pairs. entity: "$ref": "#/components/schemas/Entity" refinement: "$ref": "#/components/schemas/KeyValueSetRefinement" description: Structural refinement for this key-value set (add/remove pairs, comments) type: object title: Key Value Set Label: required: - index - name - confidence properties: index: type: integer name: type: string confidence: type: number type: object title: Label MatchField: required: - fieldName - clause - collection - mode properties: fieldName: type: string clause: type: string enum: - MUST - MUST_NOT - SHOULD - FILTER title: Bool Clause fuzziness: type: integer auto: type: string filter: "$ref": "#/components/schemas/EntityFilter" collection: "$ref": "#/components/schemas/Collection" threshold: type: number num_results: type: integer mode: type: string enum: - SEARCH - COMPARE - SEARCH_COMPARE title: Field Mode dimension: type: string enum: - EMPTY - ENTITY - TABLE_ROW title: Query Dimension analyzer: type: string type: object title: Match Field MatchGroup: required: - tag - index properties: tag: type: string fields: items: "$ref":
"#/components/schemas/MatchField" type: array index: type: string type: object title: Match Group Point: required: - x - y properties: x: type: number description: Relative x-coordinate of the vertex. y: type: number description: Relative y-coordinate of the vertex. type: object title: Point description: A vertex of the polygon that describes the block. Polygon: properties: points: items: "$ref": "#/components/schemas/Point" type: array type: object title: Polygon description: The polygon of the recognized entity. Table: required: - id properties: id: type: string entity: "$ref": "#/components/schemas/Entity" tag: type: string description: Can be used to uniquely identify a table on a page. headers: items: "$ref": "#/components/schemas/Entity" type: array description: The header of a table if present. rows: items: "$ref": "#/components/schemas/KeyValueSet" type: array description: The rows of the table representing the actual content of the table. refinement: "$ref": "#/components/schemas/TableRefinement" description: Structural refinement for this table (add/remove rows, headers, comments) type: object title: Table description: | An entity that describes a list of other entities such as other tables, key value sets (rows of a table) or simple text entities. TopKIndexFilter: required: - index - topk properties: index: type: string topk: type: integer type: object title: Top K Index Filter LinkedDocument: required: - tag properties: document: additionalProperties: true type: object fields: items: "$ref": "#/components/schemas/LinkedField" type: array tag: type: string score: type: number type: object title: Linked Document Error: type: object properties: code: type: string message: type: string required: - code - message Hook: type: object description: | A hook configuration that triggers notifications when specific events occur on an execution. Hooks can send data to external services via webhooks, with support for custom headers and flexible payload formats. 
required: - eventType - hookType - endpoint - payloadType properties: hookId: type: string description: Unique identifier for the hook (auto-generated if not provided) name: type: string description: Human-readable name for the hook eventType: $ref: '#/components/schemas/HookEventType' hookType: $ref: '#/components/schemas/HookType' endpoint: type: string description: Target URL for webhook delivery example: https://example.com/webhook headers: type: object additionalProperties: type: string description: Custom headers to include in webhook requests (e.g., for authentication) example: Authorization: Bearer your-token X-Custom-Header: custom-value statusFilter: type: array items: type: string description: | For STATUS_CHANGE events: list of statuses that trigger the hook. If empty or not provided, the hook triggers on any status change. example: - COMPLETED - ERROR payloadType: $ref: '#/components/schemas/HookPayloadType' transform: type: string description: | JSONata expression to transform the observation before sending. Required when payloadType is TRANSFORM or FORM_TRANSFORM. For more information see http://docs.jsonata.org/overview.html formPayloadKey: type: string description: | For FORM_TRANSFORM payloads: the form field name for the transformed JSON data. Defaults to "payload" if not specified. default: payload formDocumentKey: type: string description: | For FORM_TRANSFORM payloads: the form field name prefix for media files. Multiple files will be named as {key}[0], {key}[1], etc. if there are multiple files. Defaults to "documents" if not specified. 
default: documents enabled: type: boolean default: true description: Whether the hook is active createdAt: type: string format: date-time description: Timestamp when the hook was created (auto-set) title: Hook HookEventType: type: string description: The type of event that triggers the hook enum: - STATUS_CHANGE - OBSERVATION_UPDATE - SUBMIT title: Hook Event Type HookType: type: string description: The delivery mechanism for the hook enum: - WEBHOOK - KAFKA - NATS title: Hook Type HookPayloadType: type: string description: How to format the payload sent by the hook enum: - RAW - JSON - TRANSFORM - FORM_TRANSFORM title: Hook Payload Type LinkedField: required: - fieldName properties: documentId: type: string index: type: string score: type: number fieldName: type: string textValue: type: string quantityValue: type: integer numberValue: type: number unitValue: type: string dateValue: type: string format: date-time textData: type: string quantityData: type: integer numberData: type: number unitData: type: string dateData: type: string format: date-time cosineSimilarity: type: number quantityDiff: type: integer numberDiff: type: number same: type: boolean entityId: type: string additionalProperties: true type: object title: Linked Field Refinement: type: object description: | A user-provided correction to an extracted entity value. Refinements are additive - the original extraction is preserved while the correction is stored separately. 
      properties:
        correctedText:
          type: string
          description: The corrected text representation
        correctedValue:
          type: string
          description: The corrected value as a string (will be parsed according to entity type)
        correctedGeometry:
          $ref: "#/components/schemas/Geometry"
          description: Optional corrected location/bounding box
        comment:
          type: string
          description: Explanation of why the correction was made
        timestamp:
          type: string
          format: date-time
          description: When the refinement was created (auto-set if not provided)
        source:
          type: string
          description: Source of the refinement (e.g., "user", "review", "automated")
      title: Refinement
    KeyValuePairRefinement:
      type: object
      description: Refinement for a key-value pair association
      properties:
        comment:
          type: string
          description: Comment about the key-value association (e.g., wrong pairing)
        timestamp:
          type: string
          format: date-time
        source:
          type: string
      title: Key Value Pair Refinement
    TableRefinement:
      type: object
      description: Structural refinement for a table
      properties:
        addedHeaders:
          type: array
          items:
            $ref: "#/components/schemas/Entity"
          description: Headers to add to the table
        deletedHeaderIds:
          type: array
          items:
            type: string
          description: IDs of headers to mark as deleted
        addedRows:
          type: array
          items:
            $ref: "#/components/schemas/KeyValueSet"
          description: Rows to add to the table
        deletedRowIds:
          type: array
          items:
            type: string
          description: IDs of rows (KeyValueSet IDs) to mark as deleted
        comment:
          type: string
          description: Description of structural issues with the table
        timestamp:
          type: string
          format: date-time
        source:
          type: string
      title: Table Refinement
    KeyValueSetRefinement:
      type: object
      description: Structural refinement for a key-value set
      properties:
        addedPairs:
          type: array
          items:
            $ref: "#/components/schemas/KeyValuePair"
          description: Key-value pairs to add
        deletedPairIds:
          type: array
          items:
            type: string
          description: IDs of key-value pairs to mark as deleted (by key entity ID)
        comment:
          type: string
          description: Description of structural
            issues with the key-value set
        timestamp:
          type: string
          format: date-time
        source:
          type: string
      title: Key Value Set Refinement
    EntityRefinementItem:
      type: object
      description: A refinement targeting a specific entity by ID
      required:
        - entityId
        - refinement
      properties:
        entityId:
          type: string
          description: ID of the entity to refine
        refinement:
          $ref: "#/components/schemas/Refinement"
      title: Entity Refinement Item
    KeyValuePairRefinementItem:
      type: object
      description: A refinement targeting a specific key-value pair
      required:
        - keyEntityId
      properties:
        keyEntityId:
          type: string
          description: ID of the key entity in the pair
        parentKeyValueSetId:
          type: string
          description: ID of the parent KeyValueSet (for disambiguation)
        refinement:
          $ref: "#/components/schemas/KeyValuePairRefinement"
      title: Key Value Pair Refinement Item
    TableRefinementItem:
      type: object
      description: A refinement targeting a specific table
      required:
        - tableId
      properties:
        tableId:
          type: string
          description: ID of the table to refine
        refinement:
          $ref: "#/components/schemas/TableRefinement"
      title: Table Refinement Item
    KeyValueSetRefinementItem:
      type: object
      description: A refinement targeting a specific key-value set
      required:
        - keyValueSetId
      properties:
        keyValueSetId:
          type: string
          description: ID of the key-value set to refine
        refinement:
          $ref: "#/components/schemas/KeyValueSetRefinement"
      title: Key Value Set Refinement Item
    RefinementRequest:
      type: object
      description: Batch request for submitting refinements to an observation
      properties:
        entityRefinements:
          type: array
          description: Refinements for individual entities
          items:
            $ref: "#/components/schemas/EntityRefinementItem"
        keyValuePairRefinements:
          type: array
          description: Refinements for key-value pair associations
          items:
            $ref: "#/components/schemas/KeyValuePairRefinementItem"
        tableRefinements:
          type: array
          description: Structural refinements for tables
          items:
            $ref: "#/components/schemas/TableRefinementItem"
        keyValueSetRefinements:
          type: array
          description:
            Structural refinements for key-value sets
          items:
            $ref: "#/components/schemas/KeyValueSetRefinementItem"
      title: Refinement Request
    RefinementSummary:
      type: object
      description: Summary of all refinements applied to an observation
      properties:
        executionId:
          type: string
        totalRefinements:
          type: integer
          description: Total count of refinements
        entityRefinementCount:
          type: integer
        structuralRefinementCount:
          type: integer
        lastUpdated:
          type: string
          format: date-time
      title: Refinement Summary
    SubmitResponse:
      type: object
      description: Response from the submit endpoint containing payloads sent to hooks
      properties:
        hookPayloads:
          type: array
          description: List of payloads sent to each hook
          items:
            $ref: '#/components/schemas/HookPayloadResult'
      title: Submit Response
    HookPayloadResult:
      type: object
      description: Result of sending a payload to a specific hook
      properties:
        hookId:
          type: string
          description: ID of the hook
        hookName:
          type: string
          description: Name of the hook
        endpoint:
          type: string
          description: Target endpoint URL
        payload:
          type: object
          additionalProperties: true
          description: The exact payload that was sent to the hook endpoint
        delivered:
          type: boolean
          description: Whether the payload was successfully delivered
        error:
          type: string
          description: Error message if delivery failed
      title: Hook Payload Result
  securitySchemes:
    ApiKeyAuth:
      type: apiKey
      in: header
      name: X-API-Key
security:
  - ApiKeyAuth: []
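# ---------------------------------------------------------------------------
# Illustrative sketch only; not part of the specification.
#
# The commented YAML below shows one object that should validate against the
# Hook schema, using the FORM_TRANSFORM payload type. Every value (hookId,
# name, endpoint, token, and the JSONata expression) is a hypothetical
# placeholder chosen for this example.
#
#   hookId: example-hook
#   name: Notify on completion
#   eventType: STATUS_CHANGE
#   hookType: WEBHOOK
#   endpoint: https://example.com/webhook
#   headers:
#     Authorization: Bearer your-token
#   statusFilter:
#     - COMPLETED
#     - ERROR
#   payloadType: FORM_TRANSFORM
#   transform: "$"               # placeholder JSONata expression (identity)
#   formPayloadKey: payload      # schema default, shown explicitly
#   formDocumentKey: documents   # schema default, shown explicitly
#   enabled: true
#
# A RefinementRequest body might look like the following. The IDs and
# corrected values are assumptions; real IDs come from the observation
# being refined.
#
#   entityRefinements:
#     - entityId: entity-123
#       refinement:
#         correctedText: "42 kg"
#         correctedValue: "42"
#         comment: OCR misread the quantity
#         source: user
#   tableRefinements:
#     - tableId: table-1
#       refinement:
#         deletedRowIds:
#           - row-7
#         comment: Duplicate row
#         source: review
# ---------------------------------------------------------------------------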