openapi: 3.0.3
info:
  version: '1.0'
  title: Cambrion API
  description: |
    The official Cambrion API specification.
    To receive a free API key, reach out at info@cambrion.de with a brief description of your use case.
servers:
  - url: http://localhost:8080
  - url: https://api.cambrion.io/v1
tags:
  - name: Executions
    description: Execution environments that store results.
    externalDocs:
      description: Find out more
      url: https://docs.cambrion.io/docs/workflows/executions
  - name: Pipelines
    description: Machine learning pipelines.
    externalDocs:
      description: Find out more in the documentation
      url: https://docs.cambrion.io/docs/workflows/pipeline
  - name: Deployments
    description: (TODO) Deployed Pipelines.
    externalDocs:
      description: Find out more in the documentation
      url: https://docs.cambrion.io/docs
  - name: Indices
    description: Indices
    externalDocs:
      description: Find out more in the documentation
      url: https://docs.cambrion.io/docs/workflows/linker
paths:
  /executions:
    post:
      summary: Creates an execution
      description: |
        Create an execution, optionally from a given ID.
        If an execution ID is given, that ID is used; otherwise a new one is created.
        If the execution already exists, the request is ignored and 204 is returned.
        If a new execution was created, 200 is returned with the execution ID as body.
      tags:
        - Executions
      requestBody:
        $ref: '#/components/requestBodies/ExecutionRequest'
      responses:
        '200':
          $ref: '#/components/responses/ExecutionResponse'
        '204':
          description: Execution already exists
        '400':
          description: Invalid Execution ID.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
    get:
      summary: Gets all executions
      tags:
        - Executions
      parameters:
        - $ref: '#/components/parameters/ExecutionTag'
      responses:
        '200':
          $ref: '#/components/responses/ExecutionsResponse'
        '400':
          description: Invalid Execution ID.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
  /executions/{executionId}:
    get:
      summary: Gets execution
      tags:
        - Executions
      parameters:
        - $ref: '#/components/parameters/ExecutionId'
      responses:
        '200':
          $ref: '#/components/responses/ExecutionResponse'
        '400':
          description: Invalid Execution ID.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
  /executions/{executionId}/observation/media:
    post:
      summary: Add media to an observation
      tags:
        - Executions
      parameters:
        - $ref: '#/components/parameters/ExecutionId'
      requestBody:
        content:
          image/jpeg:
            schema:
              type: string
              format: base64
          image/png:
            schema:
              type: string
              format: base64
          application/pdf:
            schema:
              type: string
              format: base64
      responses:
        '200':
          $ref: '#/components/responses/MediaIdResponse'
        '400':
          description: Bad request.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /executions/{executionId}/observation/media/{mediaId}:
    get:
      summary: Retrieve a specific media
      tags:
        - Executions
      parameters:
        - $ref: '#/components/parameters/ExecutionId'
        - $ref: '#/components/parameters/MediaId'
      responses:
        '200':
          $ref: '#/components/responses/MediaResponse'
  /executions/{executionId}/observation:
    post:
      summary: Merge a raw observation into the current observation
      description: The raw observation is merged into the current observation context.
      parameters:
        - $ref: '#/components/parameters/ExecutionId'
      tags:
        - Executions
      requestBody:
        $ref: '#/components/requestBodies/Observation'
      responses:
        '204':
          description: Observation updated
        '400':
          description: Invalid observation
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
    get:
      summary: Get observation
      description: Get a full observation of the execution.
      parameters:
        - $ref: '#/components/parameters/ExecutionId'
      tags:
        - Executions
      responses:
        '200':
          $ref: '#/components/responses/ObservationResponse'
        '400':
          description: Bad request
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /executions/{executionId}/observation/transform:
    post:
      summary: Transform an observation
      description: Transform a raw observation into an object using a JSONata statement. JSONata is a transformation language for JSON data. For more information, see http://docs.jsonata.org/overview.html
      parameters:
        - $ref: '#/components/parameters/ExecutionId'
      tags:
        - Executions
      requestBody:
        content:
          text/plain:
            schema:
              type: string
      responses:
        '200':
          $ref: '#/components/responses/TransformerResponse'
        '400':
          description: Bad transform object.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /executions/{executionId}/observation/json:
    get:
      summary: Transform an observation into JSON
      description: |
        Transform a raw observation into the corresponding JSON object.
        The values in the JSON object correspond to the data values in the observation.
        If data values are not available, the raw text is used.
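      # Sketch (illustrative values, not part of the spec): for an observation whose entities are
      # labeled "date", "amount", and "name", where "name" has no data value and falls back to its
      # raw text, the returned object might look like:
      #   {"date": "2023-05-17", "amount": 12.5, "name": "ACME GmbH"}
      # Labels become the keys, as stated in the Entity label description; the value formats shown
      # here are assumptions.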
      parameters:
        - $ref: '#/components/parameters/ExecutionId'
      tags:
        - Executions
      responses:
        '200':
          $ref: '#/components/responses/JsonResponse'
        '400':
          description: Bad transform object.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /executions/{executionId}/link:
    post:
      summary: Link results of an execution
      description: Link contents of observation to documents in an index.
      parameters:
        - $ref: '#/components/parameters/ExecutionId'
      tags:
        - Executions
      requestBody:
        $ref: '#/components/requestBodies/LinkerRequest'
      responses:
        '200':
          $ref: '#/components/responses/LinkerResponse'
        '400':
          description: Bad linker parameters
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /pipelines:
    get:
      summary: Get all deployed pipelines
      tags:
        - Pipelines
      responses:
        '200':
          description: Get list of all available pipelines
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Pipeline'
        '401':
          $ref: '#/components/responses/Unauthorized'
    post:
      summary: Create new pipeline
      tags:
        - Pipelines
      requestBody:
        $ref: '#/components/requestBodies/PipelineRequest'
      responses:
        '200':
          description: Pipeline created
          content:
            application/json:
              schema:
                type: object
                properties:
                  pipelineId:
                    type: string
        '400':
          description: Invalid pipeline.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /pipelines/{pipelineId}:
    get:
      summary: Get a specific pipeline
      tags:
        - Pipelines
      parameters:
        - $ref: '#/components/parameters/PipelineId'
      responses:
        '200':
          $ref: '#/components/responses/PipelineResponse'
        '400':
          description: Bad request.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
    put:
      summary: Update an existing pipeline
      tags:
        - Pipelines
      parameters:
        - $ref: '#/components/parameters/PipelineId'
      requestBody:
        $ref: '#/components/requestBodies/PipelineRequest'
      responses:
        '200':
          $ref: '#/components/responses/PipelineResponse'
        '400':
          description: Invalid pipeline.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
    delete:
      summary: Delete a specific pipeline
      tags:
        - Pipelines
      parameters:
        - $ref: '#/components/parameters/PipelineId'
      responses:
        '204':
          description: Deleted
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /pipelines/{pipelineId}/definition:
    get:
      summary: Get graph representation (definition) of a pipeline
      tags:
        - Pipelines
      parameters:
        - $ref: '#/components/parameters/PipelineId'
      responses:
        '200':
          $ref: '#/components/responses/PipelineDefinitionResponse'
        '400':
          description: Bad request.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /pipelines/{pipelineId}/executeSync:
    post:
      summary: Execute pipeline synchronously
      description: Execute a pipeline synchronously and return the corresponding observation
      tags:
        - Pipelines
      parameters:
        - $ref: '#/components/parameters/PipelineId'
      requestBody:
        $ref: '#/components/requestBodies/PipelineExecutionRequest'
      responses:
        '200':
          $ref: '#/components/responses/PipelineSyncResponse'
        '400':
          description: Invalid execution request (e.g. execution not found).
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /pipelines/{pipelineId}/executeSync/transform:
    post:
      summary: Execute pipeline synchronously and transform the observation
      description: |
        Execute a pipeline synchronously and return the transformed observation.
      parameters:
        - $ref: '#/components/parameters/PipelineId'
      tags:
        - Pipelines
      requestBody:
        $ref: '#/components/requestBodies/PipelineExecutionRequest'
      responses:
        '200':
          $ref: '#/components/responses/TransformerResponse'
        '400':
          description: Bad transform object.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /pipelines/{pipelineId}/executeSync/json:
    post:
      summary: Execute pipeline synchronously and return the observation as JSON
      description: |
        Execute a pipeline synchronously and return the corresponding JSON object.
      parameters:
        - $ref: '#/components/parameters/PipelineId'
      tags:
        - Pipelines
      requestBody:
        $ref: '#/components/requestBodies/PipelineExecutionRequest'
      responses:
        '200':
          $ref: '#/components/responses/JsonResponse'
        '400':
          description: Bad transform object.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /pipelines/{pipelineId}/executeAsync:
    post:
      summary: Execute pipeline asynchronously
      tags:
        - Pipelines
      parameters:
        - $ref: '#/components/parameters/PipelineId'
      requestBody:
        $ref: '#/components/requestBodies/PipelineExecutionRequest'
      responses:
        '200':
          $ref: '#/components/responses/PipelineAsyncResponse'
  /models:
    get:
      summary: Get all registered models
      tags:
        - Models
      responses:
        '200':
          description: Get list of all registered models
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Model'
    post:
      summary: Register an uploaded model (currently internal only)
      tags:
        - Models
      requestBody:
        $ref: '#/components/requestBodies/ModelRequest'
      responses:
        '200':
          description: Deployment successful
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /indices:
    get:
      summary: Get all indices
      description: Get a list of all indices.
      tags:
        - Indices
      parameters:
        - $ref: '#/components/parameters/IndexLimit'
        - $ref: '#/components/parameters/IndexOffset'
      responses:
        '200':
          description: A list of all indices
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Index'
    post:
      summary: Create index
      description: Create a new index with an optional schema.
      tags:
        - Indices
      requestBody:
        $ref: '#/components/requestBodies/IndexRequest'
      responses:
        '204':
          description: Index created
        '400':
          description: Invalid request
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
  /indices/{indexId}:
    get:
      summary: Get index description
      description: |
        Get the description of an index, including the data model if present.
      tags:
        - Indices
      parameters:
        - $ref: '#/components/parameters/IndexId'
      responses:
        '200':
          $ref: '#/components/responses/IndexResponse'
        '400':
          description: Invalid index ID
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
    delete:
      summary: Delete index
      description: Delete an index.
      tags:
        - Indices
      parameters:
        - $ref: '#/components/parameters/IndexId'
      responses:
        '204':
          description: Index deleted.
        '400':
          description: Invalid index ID
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /indices/{indexId}/query:
    post:
      summary: Query an index with a search string
      description: |
        Query an index with a search string.
      tags:
        - Indices
      parameters:
        - $ref: '#/components/parameters/IndexId'
      requestBody:
        $ref: '#/components/requestBodies/QueryRequest'
      responses:
        '200':
          $ref: '#/components/responses/QueryResponse'
        '400':
          description: Invalid index ID
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /indices/{indexId}/documents:
    get:
      summary: Get all documents
      tags:
        - Indices
      parameters:
        - $ref: '#/components/parameters/IndexId'
        - $ref: '#/components/parameters/DocumentLimit'
        - $ref: '#/components/parameters/DocumentOffset'
      responses:
        '200':
          description: A list of all documents in the index
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/IndexedDocument'
    post:
      summary: Create document
      description: |
        Create a JSON document in an index. If the index does not exist, it will be created automatically.
      tags:
        - Indices
      parameters:
        - $ref: '#/components/parameters/IndexId'
      requestBody:
        $ref: '#/components/requestBodies/DocumentRequest'
      responses:
        '204':
          description: Document stored successfully
        '400':
          description: Invalid index ID
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
  /indices/{indexId}/documents/{documentId}:
    get:
      summary: Get document
      tags:
        - Indices
      parameters:
        - $ref: '#/components/parameters/IndexId'
        - $ref: '#/components/parameters/DocumentId'
      responses:
        '200':
          $ref: '#/components/responses/DocumentResponse'
        '400':
          description: Invalid index ID or document ID
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
    delete:
      summary: Delete document
      tags:
        - Indices
      parameters:
        - $ref: '#/components/parameters/IndexId'
        - $ref: '#/components/parameters/DocumentId'
      responses:
        '204':
          description: Document deleted
        '400':
          description: Invalid index ID or document ID
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '404':
          $ref: '#/components/responses/NotFound'
components:
  requestBodies:
    PipelineExecutionRequest:
      description: Execution request for a pipeline
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/PipelineExecutionObject'
    ExecutionRequest:
      description: Execution is the context which holds data related to a specific execution
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Execution'
    ModelRequest:
      description: Model request
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Model'
    PipelineRequest:
      description: Pipeline request
      content:
        application/json:
          schema:
            type: object
            properties:
              pipelineId:
                type: string
                example: receipt-pipeline
              name:
                type: string
                example: receipt-pipeline
              deploy:
                type: boolean
                default:
                  true
                description: Whether to deploy the pipeline when creating/updating it
              description:
                type: string
                example: A pipeline to extract contents from a receipt
              version:
                type: integer
                example: 1
              pipelineDefinition:
                $ref: '#/components/schemas/PipelineDefinition'
    Observation:
      description: Observation request
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Observation'
    MediaRequest:
      description: Media request
      content:
        image/png:
          schema:
            type: string
            format: binary
        application/pdf:
          schema:
            type: string
            format: binary
        text/plain:
          schema:
            type: string
    LinkerRequest:
      description: Linker request
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/EntityLinkerConfig'
    IndexRequest:
      description: Index creation request
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Index'
    DocumentRequest:
      description: Document request
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/IndexedDocument'
    QueryRequest:
      description: Query request
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Query'
  responses:
    ExecutionResponse:
      description: Response of an execution
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Execution'
    ExecutionsResponse:
      description: List of executions
      content:
        application/json:
          schema:
            type: array
            items:
              $ref: '#/components/schemas/Execution'
    PipelineResponse:
      description: Response to the creation of a new pipeline
      content:
        application/json:
          schema:
            type: object
            properties:
              pipeline:
                $ref: '#/components/schemas/Pipeline'
              pipelineDefinition:
                $ref: '#/components/schemas/PipelineDefinition'
    PipelineStatusResponse:
      description: Status response of pipeline
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Pipeline'
    PipelineDefinitionResponse:
      description: Definition of the pipeline
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/PipelineDefinition'
    PipelineSyncResponse:
      description: Response of synchronous pipeline execution
      content:
        application/json:
          schema:
            type: object
            properties:
              executionId:
                type: string
              observation:
                $ref: "#/components/schemas/Observation"
    PipelineDeploymentResponse:
      description: Response of pipeline deployment
      content:
        application/json:
          schema:
            type: object
            properties:
              pipelineDeploymentId:
                type: string
    PipelineAsyncResponse:
      description: Response of asynchronous pipeline execution
      content:
        application/json:
          schema:
            type: object
            properties:
              executionId:
                type: string
    ModelResponse:
      description: Response to a model request
      content:
        application/json:
          schema:
            type: object
            properties:
              modelName:
                type: string
    MediaResponse:
      description: Specific media
      content:
        image/*:
          schema:
            type: string
            format: binary
        application/pdf:
          schema:
            type: string
            format: binary
        text/plain:
          schema:
            type: string
    MediaIdResponse:
      description: ID of the uploaded media
      content:
        application/json:
          schema:
            type: object
            properties:
              mediaId:
                type: string
    ObservationResponse:
      description: Observation response
      content:
        application/json:
          schema:
            $ref: "#/components/schemas/Observation"
    LinkerResponse:
      description: Linker response
      content:
        application/json:
          schema:
            type: array
            items:
              $ref: '#/components/schemas/LinkedDocument'
    DocumentResponse:
      description: Document response
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/IndexedDocument'
    TransformerResponse:
      description: |
        Response after transformation. The output can be an arbitrary object created by a JSONata expression.
        It must be deserialized.
      content:
        text/plain:
          schema:
            type: string
    JsonResponse:
      description: |
        Response after the observation was transformed to the corresponding JSON object.
      content:
        application/json:
          schema:
            additionalProperties: true
    IndexResponse:
      description: Description of an index
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Index'
    QueryResponse:
      description: List of documents that matched the query
      content:
        application/json:
          schema:
            type: array
            items:
              $ref: '#/components/schemas/IndexedDocument'
    NotFound:
      description: The specified resource was not found
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
    Unauthorized:
      description: Unauthorized
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
    PipelineAlreadyDeployed:
      description: The pipeline to be deployed was already deployed
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Pipeline'
  parameters:
    PipelineId:
      name: pipelineId
      in: path
      description: ID of the pipeline to execute
      required: true
      schema:
        type: string
    ModelId:
      name: modelId
      in: path
      description: ID of the model
      required: true
      schema:
        type: string
    DeploymentId:
      name: deploymentId
      in: path
      description: ID of the pipeline deployment
      required: true
      schema:
        type: string
    ExecutionId:
      name: executionId
      in: path
      description: ID of an execution
      required: true
      schema:
        type: string
    MediaId:
      name: mediaId
      in: path
      description: ID of uploaded media
      required: true
      schema:
        type: string
    IndexId:
      name: indexId
      in: path
      description: ID of the index
      required: true
      schema:
        type: string
        example: Warehouse-Index
    DocumentId:
      name: documentId
      in: path
      description: ID of a document
      required: true
      schema:
        type: string
    ExecutionTag:
      name: tag
      in: query
      description: Filter executions by tag
      schema:
        type: string
    IndexLimit:
      name: limit
      in: query
      description: Limits the number of indices on a page
      schema:
        type: integer
    IndexOffset:
      name: offset
      in: query
      description: Specifies the page number of the indices to be displayed
      schema:
        type: integer
    DocumentLimit:
      name: limit
      in: query
      description: Limits the number of documents on a page
      schema:
        type: integer
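    # Sketch (not part of the spec): paging through the documents of an index with the limit and
    # offset query parameters. The index ID is the example value from IndexId; whether offset
    # counts pages from 0 or from 1 is not stated in this specification.
    #   GET /indices/Warehouse-Index/documents?limit=20&offset=2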
    DocumentOffset:
      name: offset
      in: query
      description: Specifies the page number of the documents to be displayed
      schema:
        type: integer
  schemas:
    Model:
      type: object
      properties:
        name:
          type: string
        description:
          type: string
        version:
          type: integer
        resourceTag:
          type: string
    Pipeline:
      type: object
      properties:
        pipelineId:
          type: string
          description: ID of the pipeline
        name:
          type: string
          description: Name of the pipeline
        description:
          type: string
          description: Description of the pipeline
        tag:
          type: string
          description: A tag that can be used to group pipelines
        status:
          type: string
          description: Current status of the pipeline
        version:
          type: integer
          description: Version of the pipeline
    PipelineDefinition:
      type: object
      properties:
        pipelineDefinitionId:
          type: string
          example: receipt-pipeline-definition
        nodes:
          type: array
          description: The nodes of the graph that describes the pipeline
          items:
            "$ref": "#/components/schemas/PipelineNode"
          example:
            - modelId: ocr_recognizer
              modelName: ocr_recognizer
              modelVersion: 1
              modelParameters:
                param1: 1
                param2: 2
              canvas:
                position:
                  x: 0
                  y: 250
              inputs:
                - inputName: info_array_ocr_input
                  inputShape:
                    - 1
                  inputType: STRING
              outputs:
                - outputName: info_array_ocr_output
                  outputShape:
                    - 1
                  outputType: STRING
            - modelId: static_layout_recognizer
              modelName: static_layout_recognizer
              modelVersion: 1
              modelParameters:
                targetModel: some_model
                labels:
                  - date
                  - name
                  - amount
              canvas:
                position:
                  x: 0
                  y: 500
              inputs:
                - inputName: info_array_static_layout_input
                  inputShape:
                    - 1
                  inputType: STRING
              outputs:
                - outputName: info_array_static_layout_output
                  outputShape:
                    - 1
                  outputType: STRING
            - modelId: entity_parser
              modelName: entity_parser
              modelVersion: 1
              modelParameters:
                date: DATE
                name: STRING
                amount: NUMBER
              canvas:
                position:
                  x: 0
                  y: 750
              inputs:
                - inputName: info_array_parser_input
                  inputShape:
                    - 1
                  inputType: STRING
              outputs:
                - outputName: info_array_parser_output
                  outputShape:
                    - 1
                  outputType: STRING
            - modelId: entity_deduplicator
              modelName: entity_deduplicator
              modelVersion: 1
              modelParameters:
                keys:
                  - date
                  - name
                  -
                    amount
              canvas:
                position:
                  x: 0
                  y: 1000
              inputs:
                - inputName: info_array_deduplicator_input
                  inputShape:
                    - 1
                  inputType: STRING
              outputs:
                - outputName: info_array_deduplicator_output
                  outputShape:
                    - 1
                  outputType: STRING
        edges:
          type: array
          description: The edges of the graph that describes the pipeline
          items:
            "$ref": "#/components/schemas/PipelineEdge"
          example:
            - id: edge-1
              dataHandle: ocr_result
              source: ocr_recognizer
              target: static_layout_recognizer
              sourceHandle: info_array_ocr_output
              targetHandle: info_array_static_layout_input
            - id: edge-2
              dataHandle: recognizer_result
              source: static_layout_recognizer
              target: entity_parser
              sourceHandle: info_array_static_layout_output
              targetHandle: info_array_parser_input
            - id: edge-3
              dataHandle: parser_result
              source: entity_parser
              target: entity_deduplicator
              sourceHandle: info_array_parser_output
              targetHandle: info_array_deduplicator_input
    PipelineEdge:
      type: object
      description: |
        An edge in the graph describing the pipeline.
        Multiple edges can originate from a single node, but only a single edge can end at a target.
        To have multiple edges end at a single node, each edge must be assigned to a corresponding input.
      properties:
        id:
          type: string
          description: The ID of the edge.
        dataHandle:
          type: string
          description: |
            Name of the handle that holds the data transferred from one node to another.
            This needs to be used when data goes from one node to multiple other nodes.
        source:
          type: string
          description: Model ID of the source node
        target:
          type: string
          description: Model ID of the target node
        sourceHandle:
          type: string
          description: Name of the output variable of the source node.
        targetHandle:
          type: string
          description: Name of the input variable of the target node.
    PipelineNode:
      type: object
      description: |
        A node of the graph describing the pipeline. The node corresponds to a single execution of a model.
        The graph can only have a single input node. The identity model (identity_model) can be used to fan out the input.
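      # Sketch (not part of the spec): fanning out the single input through the identity model.
      # Both edges share one dataHandle because the same data goes to several nodes. The handle
      # names identity_output and raw_input are hypothetical; the target names are taken from the
      # PipelineDefinition example.
      #   edges:
      #     - id: edge-a
      #       dataHandle: raw_input
      #       source: identity_model
      #       target: ocr_recognizer
      #       sourceHandle: identity_output
      #       targetHandle: info_array_ocr_input
      #     - id: edge-b
      #       dataHandle: raw_input
      #       source: identity_model
      #       target: entity_parser
      #       sourceHandle: identity_output
      #       targetHandle: info_array_parser_input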
      properties:
        modelId:
          type: string
          description: ID of the model
        modelName:
          type: string
          description: Name of the model
        modelVersion:
          type: integer
          default: -1
          description: Version of the model
        canvas:
          type: object
          description: Properties used to display the node
          properties:
            position:
              type: object
              properties:
                x:
                  type: integer
                y:
                  type: integer
        modelParameters:
          type: object
          additionalProperties: true
          description: The parameters that will be provided to the underlying model at inference time
        inputs:
          type: array
          description: Inputs of the model. These depend on the implementation.
          items:
            $ref: '#/components/schemas/ModelInput'
        outputs:
          type: array
          description: Outputs of the model. These depend on the implementation.
          items:
            $ref: '#/components/schemas/ModelOutput'
    ModelInput:
      type: object
      properties:
        inputName:
          type: string
          description: Variable name of the input
        inputShape:
          type: array
          items:
            type: integer
        inputType:
          type: string
          default: "STRING"
          enum:
            - STRING
            - INT_TENSOR
            - FLOAT_TENSOR
    ModelOutput:
      type: object
      properties:
        outputName:
          type: string
          description: Variable name of the output
        outputShape:
          type: array
          items:
            type: integer
        outputType:
          type: string
          default: "STRING"
          enum:
            - STRING
            - INT_TENSOR
            - FLOAT_TENSOR
    Execution:
      type: object
      description: |
        The execution is a stateful environment in which media (such as images or PDF files) can be stored and used as inputs for pipelines.
        The ID in the request body is optional (generated if empty) and must be unique.
      properties:
        executionId:
          type: string
          description: ID of the execution
        tag:
          type: string
          description: Tag to identify the execution
        createdAt:
          type: string
          description: Creation time
        metaData:
          type: object
          additionalProperties: true
    PipelineExecutionObject:
      type: object
      description: |
        The execution is a stateful environment in which media (such as images or PDF files) can be stored and used as inputs for pipelines.
        The ID in the request body is optional (generated if empty) and must be unique.
        Nothing will be persisted if transient is true.
        In order to trigger the pipeline, either an execution ID containing valid media or base64-encoded media under the media property has to be provided.
      properties:
        executionId:
          type: string
          description: ID of the execution
        tag:
          type: string
          description: Tag used to identify the resulting execution. Ignored if transient is true.
        transient:
          type: boolean
          description: Whether to delete all execution data after pipeline completion
        transform:
          type: string
          description: |
            JSONata instruction to transform the result observation into a desired object.
            JSONata is a transformation language for JSON data.
            For more information, see http://docs.jsonata.org/overview.html
        tryImageConversion:
          type: boolean
          description: Tries to convert the provided content to an image (e.g. PDF)
          default: true
        trySimpleText:
          type: boolean
          description: |
            Tries to extract readable text from input media (e.g. Word doc). A number of different file formats are supported.
            Internally, Apache Tika is used for text extraction. A full list of supported file formats can be found here: https://tika.apache.org/2.9.1/formats.html
          default: false
        idempotent:
          type: boolean
          description: Whether to update the existing observation with the results from the pipeline run (always true if executionId is null)
          default: false
        media:
          description: |
            Array of base64-encoded media files. Content type will be detected automatically.
            PDF, DOCX, and PPTX files will be rendered as images.
            The images can then be processed within a pipeline.
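          # Sketch (not part of the spec): a transient request body for
          # POST /pipelines/{pipelineId}/executeSync that passes one file inline instead of
          # referencing an execution ID. The media string is a placeholder, not real data.
          #   {
          #     "transient": true,
          #     "tryImageConversion": true,
          #     "media": ["<base64-encoded file>"]
          #   }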
          items:
            type: string
            format: base64
          type: array
        text:
          type: string
          description: Raw text that can be used as input in pipelines
    EntityLinkerConfig:
      properties:
        group:
          items:
            "$ref": "#/components/schemas/MatchGroup"
          type: array
        document:
          type: object
          additionalProperties: true
        topk:
          items:
            "$ref": "#/components/schemas/TopKIndexFilter"
          type: array
      type: object
      title: Entity Link Request
    Index:
      type: object
      properties:
        indexId:
          type: string
          description: Unique ID of the index.
        semanticSearchField:
          type: string
          description: This field is used to encode semantic information
        indexSchema:
          type: object
          additionalProperties: true
          description: |
            An object that describes the data types of the document fields. This is equivalent to the Elasticsearch mappings object.
            For more information, see https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
    IndexedDocument:
      type: object
      properties:
        indexId:
          type: string
        source:
          type: object
          additionalProperties: true
    Query:
      type: object
      properties:
        searchString:
          type: string
          description: The string that is used for the search
        k:
          type: integer
          description: The number of results to return that are similar to the search string
    Block:
      required:
        - text
      properties:
        text:
          type: string
        geometry:
          "$ref": "#/components/schemas/Geometry"
      type: object
      title: Block
      description: The entity block combining recognized text and geometry.
    BoundingBox:
      required:
        - width
        - height
        - left
        - top
      properties:
        width:
          type: number
          description: The relative width of the entity box. Can be between 0 and 1.
        height:
          type: number
          description: The relative height of the entity box. Can be between 0 and 1.
        left:
          type: number
          description: The relative left-most x-coordinate of the entity box. Can be between 0 and 1.
        top:
          type: number
          description: The relative top-most y-coordinate of the entity box. Can be between 0 and 1.
      type: object
      title: Bounding Box
      description: The bounding box describing the location of the recognized entity.
    Code:
      required:
        - id
        - entity
        - payload
        - type
      properties:
        id:
          type: string
        entity:
          "$ref": "#/components/schemas/Entity"
        tag:
          type: string
        payload:
          type: string
          format: binary
        type:
          type: string
          enum:
            - UPC_A
            - UPC_E
            - EAN_8
            - EAN_13
            - UPC_EAN_EXTENSION
            - CODE_39
            - CODE_93
            - CODE_128
            - CODEBAR
            - ITF
            - QR_CODE
            - DATA_MATRIX
            - AZTEC
            - PDF_417
            - MAXICODE
            - RSS_14
            - RSS_EXPANDED
          title: Code Type
      type: object
      title: 'Code entity'
    Collection:
      required:
        - source
      properties:
        source:
          type: string
          enum:
            - ENTITY_TEXT
            - ENTITY_VALUE
            - ENTITY_DATA
            - KEY_VALUE
            - KEY_DATA
            - KEY_TEXT
            - VALUE_TEXT
            - VALUE_VALUE
            - VALUE_DATA
            - TABLE_CELL_ROW_TAG
            - CODE
          title: Collection Source
      type: object
      title: Collection
    Document:
      description: A document in a media file corresponding to a page.
      properties:
        id:
          type: string
        tables:
          description: All tables contained in the document
          items:
            "$ref": "#/components/schemas/Table"
          type: array
        entities:
          description: All text entities contained in the document
          items:
            "$ref": "#/components/schemas/Entity"
          type: array
        keyValueSet:
          "$ref": "#/components/schemas/KeyValueSet"
      type: object
      title: Document
    DocumentPage:
      required:
        - page
        - document
      properties:
        page:
          type: integer
          description: The page number
          example: 1
        document:
          "$ref": "#/components/schemas/Document"
      type: object
      title: 'Document page'
    Entity:
      description: |
        A single entity in a document. An entity is the central carrier of information and holds properties like
        text and parsed data (such as numbers or dates converted from text) but also meta information such as
        location and layout type.
      required:
        - id
        - block
      properties:
        id:
          type: string
          description: A unique ID for the entity
        block:
          "$ref": "#/components/schemas/Block"
        confidence:
          type: number
          description: The confidence score of the entity. Typically a score that is the result of the OCR detection.
label: type: string description: | A label for the entity. When converting the observation into a JSON object, the labels are used as keys. type: type: string description: | The content type set by the entity parser. When parsing entities, the parser not only sets the type but also parses the text according to the entity type. enum: - STRING - NUMBER - QUANTITY - UNIT - DATE - PERCENTAGE title: Entity Type data: "$ref": "#/components/schemas/EntityData" embedding: items: type: number type: array similarity: "$ref": "#/components/schemas/EntitySimilarity" layoutType: type: string enum: - WORD - LINE title: Entity Layout Type type: object title: Entity EntityData: description: | The entity data holds converted (parsed) values from the entity parser as well as data associated by the entity linker. properties: documentId: type: string description: The ID of the document in an index that the entity linker assigned to the entity. textValue: type: string description: The text value of the entity (equivalent to the text). quantityValue: type: integer description: The parsed integer number from the text. numberValue: type: number description: The parsed decimal number from the text. unitValue: type: string description: The parsed unit (such as kg, m) from the text. dateValue: type: string format: date description: The parsed date in yyyy-MM-dd format. textData: type: string description: Text data from the document field that the entity linker associated with the entity. quantityData: type: integer description: Integer type data from the document field that the entity linker associated with the entity. numberData: type: number description: Decimal type data from the document field that the entity linker associated with the entity. unitData: type: string description: Unit type data from the document field that the entity linker associated with the entity.
dateData: type: string format: date description: Date type data from the document field that the entity linker associated with the entity. field: type: string description: Name of the document field that the entity linker associated with the entity. score: type: number description: The score the entity linker assigned to the association of the document field with the entity. sourceIndex: type: string description: The name of the index the associated document is stored in. type: object title: Entity Data EntityFilter: properties: tag: type: string label: type: string regExp: type: string hasData: type: boolean hasValue: type: boolean layoutType: type: string enum: - WORD - LINE title: Entity Layout Type type: object title: Entity Filter EntitySimilarity: description: A similarity measure between the associated data and the parsed value of the entity. required: - type properties: type: type: string enum: - TEXT_SIM - NUMBER_DIFF - BOOL_SAME title: Similarity Type cosineSimilarity: type: number description: The text similarity (Jaccard similarity with k=2) between the text value and text data. amountDiff: type: integer description: The difference between the parsed integer number and the corresponding data field. numberDiff: type: number description: The difference between the parsed decimal number and the corresponding data field. same: type: boolean description: Exact match.
type: object title: Entity Similarity Geometry: required: - boundingBox properties: polygon: "$ref": "#/components/schemas/Polygon" boundingBox: "$ref": "#/components/schemas/BoundingBox" type: object title: Geometry MediaContent: required: - id - mediaId properties: id: type: string example: media-1 description: The ID of the media content mediaId: type: string example: media-1 description: The ID of the media file (typically equivalent to the ID) documentPages: description: The pages of the document items: "$ref": "#/components/schemas/DocumentPage" type: array imageHash: type: string description: A hash of the media file codes: description: A list of codes recognized in the media file (e.g. barcodes or datacodes) items: "$ref": "#/components/schemas/Code" type: array metaData: "$ref": "#/components/schemas/ImageMetaData" label: "$ref": "#/components/schemas/Label" rawText: description: The raw text in the media file type: string type: object title: Image Content description: The content of a single media file ImageMetaData: required: - width - height properties: width: type: integer height: type: integer type: object title: Image Meta Data Observation: required: - executionId properties: executionId: type: string example: execution-1 mediaContents: items: "$ref": "#/components/schemas/MediaContent" type: array type: object title: Execution Observation description: The structured content of a set of media files. KeyValuePair: required: - key properties: key: "$ref": "#/components/schemas/Entity" entityValue: "$ref": "#/components/schemas/Entity" keyValueSetValue: "$ref": "#/components/schemas/KeyValueSet" tableValue: "$ref": "#/components/schemas/Table" tag: type: string type: object title: Key Value Pair description: | A description of two entities that are logically related. The key typically represents text that describes the value entity. The value entity can also represent nested structures such as other key value sets or tables.
KeyValueSet: description: A key value set can be a set of arbitrary key value pairs but also a row in a table. properties: id: type: string tag: type: string pairs: items: "$ref": "#/components/schemas/KeyValuePair" type: array description: An array of key value pairs. entity: "$ref": "#/components/schemas/Entity" type: object title: Key Value Set Label: required: - index - name - confidence properties: index: type: integer name: type: string confidence: type: number type: object title: Label MatchField: required: - fieldName - clause - collection - mode properties: fieldName: type: string clause: type: string enum: - MUST - MUST_NOT - SHOULD - FILTER title: Bool Clause fuzziness: type: integer auto: type: string filter: "$ref": "#/components/schemas/EntityFilter" collection: "$ref": "#/components/schemas/Collection" threshold: type: number num_results: type: integer mode: type: string enum: - SEARCH - COMPARE - SEARCH_COMPARE title: Field Mode dimension: type: string enum: - EMPTY - ENTITY - TABLE_ROW title: Query Dimension analyzer: type: string type: object title: Match Field MatchGroup: required: - tag - index properties: tag: type: string fields: items: "$ref": "#/components/schemas/MatchField" type: array index: type: string type: object title: Match Group Point: required: - x - y properties: x: type: number description: Relative x-coordinate of the vertex. y: type: number description: Relative y-coordinate of the vertex. type: object title: Point description: A vertex of the polygon that describes the block. Polygon: properties: points: items: "$ref": "#/components/schemas/Point" type: array type: object title: Polygon description: The polygon of the recognized entity. Table: required: - id - entity properties: id: type: string entity: "$ref": "#/components/schemas/Entity" tag: type: string description: Can be used to uniquely identify a table on a page. 
headers: items: "$ref": "#/components/schemas/Entity" type: array description: The header of a table if present. rows: items: "$ref": "#/components/schemas/KeyValueSet" type: array description: The rows of the table representing the actual content of the table. type: object title: Table description: | An entity that describes a list of other entities such as other tables, key value sets (rows of a table) or simple text entities. TopKIndexFilter: required: - index - topk properties: index: type: string topk: type: integer type: object title: Top K Index Filter LinkedDocument: required: - tag properties: document: additionalProperties: true type: object fields: items: "$ref": "#/components/schemas/LinkedField" type: array tag: type: string additionalProperties: true type: object title: Linked Document Error: type: object properties: code: type: string message: type: string required: - code - message LinkedField: required: - fieldName properties: documentId: type: string index: type: string score: type: number fieldName: type: string textValue: type: string quantityValue: type: integer numberValue: type: number unitValue: type: string dateValue: type: string format: date-time textData: type: string quantityData: type: integer numberData: type: number unitData: type: string dateData: type: string format: date-time cosineSimilarity: type: number quantityDiff: type: integer numberDiff: type: number same: type: boolean entityId: type: string additionalProperties: true type: object title: Linked Field securitySchemes: ApiKeyAuth: type: apiKey in: header name: X-API-Key security: - ApiKeyAuth: []