openapi: 3.1.0 info: title: UiPath Document Understanding API description: >- The UiPath Document Understanding API enables programmatic access to intelligent document processing capabilities including digitization, classification, extraction, and validation of document content. The API supports both synchronous and asynchronous consumption patterns, with asynchronous mode suited for multi-page documents and batch workloads, and synchronous mode for real-time processing of single-page images or short documents of up to five pages. Developers can integrate specialized machine learning models as well as generative AI-based classifiers and extractors into their applications. The API can be called from Swagger or from any programming language with an HTTP client, and can be used in both RPA and non-RPA contexts. Data from the digitization endpoint is retained for seven days; data from asynchronous classification and extraction endpoints is retained for 24 hours. version: '1.0' contact: name: UiPath Support url: https://support.uipath.com termsOfService: https://www.uipath.com/legal/terms-of-use externalDocs: description: UiPath Document Understanding API Documentation url: https://docs.uipath.com/document-understanding/automation-cloud/latest/api-guide/api-overview servers: - url: https://cloud.uipath.com/{organizationName}/{tenantName}/du_/api/framework description: UiPath Automation Cloud Document Understanding variables: organizationName: default: your-org description: The name of your UiPath organization tenantName: default: your-tenant description: The name of your UiPath tenant tags: - name: Classification description: Classify documents into predefined document types - name: Digitization description: Convert documents into a digitized format for downstream processing - name: Discovery description: Discover available projects, classifiers, and extractors - name: Extraction description: Extract structured data fields from documents - name: Validation description: Validate and correct 
digitization, classification, and extraction results security: - bearerAuth: [] paths: /projects: get: operationId: listProjects summary: UiPath List Document Understanding Projects description: >- Retrieves all available Document Understanding projects accessible to the authenticated user. Each project contains configuration for classifiers and extractors and provides URLs for downstream discovery operations. Use project IDs to scope classification and extraction requests to a specific project's models. tags: - Discovery parameters: - name: api-version in: query required: false description: API version to use for this request schema: type: string default: '1' example: 1.0.0 responses: '200': description: A list of available Document Understanding projects content: application/json: schema: type: object properties: projects: type: array items: $ref: '#/components/schemas/Project' examples: listProjects200Example: summary: Default listProjects 200 response x-microcks-default: true value: projects: - id: {} name: {} description: {} extractorsUrl: {} classifiersUrl: {} '401': $ref: '#/components/responses/Unauthorized' '403': $ref: '#/components/responses/Forbidden' x-microcks-operation: delay: 0 dispatcher: FALLBACK /projects/{projectId}/extractors: get: operationId: listExtractors summary: UiPath List Extractors for a Project description: >- Retrieves all available extractor models configured within a specific Document Understanding project. Each extractor is associated with a document type and defines the fields it can extract. Use the extractor ID in extraction requests to invoke the correct model. 
tags: - Discovery parameters: - $ref: '#/components/parameters/projectId' example: example-value - name: api-version in: query required: false description: API version to use for this request schema: type: string default: '1' example: 1.0.0 responses: '200': description: A list of extractors for the project content: application/json: schema: type: object properties: extractors: type: array items: $ref: '#/components/schemas/Extractor' examples: listExtractors200Example: summary: Default listExtractors 200 response x-microcks-default: true value: extractors: - id: {} name: {} documentType: {} extractorType: {} '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' x-microcks-operation: delay: 0 dispatcher: FALLBACK /projects/{projectId}/classifiers: get: operationId: listClassifiers summary: UiPath List Classifiers for a Project description: >- Retrieves all available classifier models configured within a specific Document Understanding project. Classifiers identify the document type of a digitized document. Specialized classifiers use machine learning models trained on specific document types; generative classifiers use large language models. 
tags: - Discovery parameters: - $ref: '#/components/parameters/projectId' example: example-value - name: api-version in: query required: false description: API version to use for this request schema: type: string default: '1' example: 1.0.0 responses: '200': description: A list of classifiers for the project content: application/json: schema: type: object properties: classifiers: type: array items: $ref: '#/components/schemas/Classifier' examples: listClassifiers200Example: summary: Default listClassifiers 200 response x-microcks-default: true value: classifiers: - id: {} name: {} classifierType: {} documentTypes: {} '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' x-microcks-operation: delay: 0 dispatcher: FALLBACK /projects/{projectId}/digitization/start: post: operationId: digitizeDocument summary: UiPath Digitize a Document description: >- Submits a document for digitization, converting it into a structured digital representation that can be used for classification, extraction, and validation. Supported input formats are JPEG, PNG, PDF, and TIFF. The response includes a documentId that must be used in subsequent classification, extraction, and validation requests. Digitized results are retained for seven days. 
tags: - Digitization parameters: - $ref: '#/components/parameters/projectId' example: example-value - name: api-version in: query required: false description: API version to use for this request schema: type: string default: '1' example: 1.0.0 requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/DigitizationRequest' examples: digitizeDocumentRequestExample: summary: Default digitizeDocument request x-microcks-default: true value: documentId: example-value contentType: image/jpeg documentBase64: example-value responses: '200': description: Document digitized successfully content: application/json: schema: $ref: '#/components/schemas/DigitizationResult' examples: digitizeDocument200Example: summary: Default digitizeDocument 200 response x-microcks-default: true value: documentId: example-value status: Succeeded pageCount: 1 '400': $ref: '#/components/responses/BadRequest' '401': $ref: '#/components/responses/Unauthorized' x-microcks-operation: delay: 0 dispatcher: FALLBACK /projects/{projectId}/classification/start: post: operationId: classifyDocumentAsync summary: UiPath Start Asynchronous Document Classification description: >- Submits a previously digitized document for asynchronous classification. Returns a requestId to poll for results. Asynchronous mode is suited for multi-page documents and concurrent batch workloads. Results are retained for 24 hours. Use GET /projects/{projectId}/classification/{requestId}/result to retrieve the classification outcome once processing is complete. 
tags: - Classification parameters: - $ref: '#/components/parameters/projectId' example: example-value - name: api-version in: query required: false description: API version to use for this request schema: type: string default: '1' example: 1.0.0 requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/ClassificationRequest' examples: classifyDocumentAsyncRequestExample: summary: Default classifyDocumentAsync request x-microcks-default: true value: documentId: example-value classifiersOptions: - {} responses: '200': description: Classification job started; use the returned requestId to poll for results content: application/json: schema: $ref: '#/components/schemas/AsyncJobStartResponse' examples: classifyDocumentAsync200Example: summary: Default classifyDocumentAsync 200 response x-microcks-default: true value: requestId: example-value status: NotStarted '400': $ref: '#/components/responses/BadRequest' '401': $ref: '#/components/responses/Unauthorized' x-microcks-operation: delay: 0 dispatcher: FALLBACK /projects/{projectId}/classification/{requestId}/result: get: operationId: getClassificationResult summary: UiPath Get Asynchronous Classification Result description: >- Retrieves the result of an asynchronous classification request by its requestId. The job status can be NotStarted, Running, Failed, or Succeeded. Poll this endpoint until the status is Succeeded or Failed. Results are available for 24 hours from the time the job completed. 
tags: - Classification parameters: - $ref: '#/components/parameters/projectId' example: example-value - $ref: '#/components/parameters/requestId' example: example-value - name: api-version in: query required: false description: API version to use for this request schema: type: string default: '1' example: 1.0.0 responses: '200': description: Classification result or current job status content: application/json: schema: $ref: '#/components/schemas/ClassificationResult' examples: getClassificationResult200Example: summary: Default getClassificationResult 200 response x-microcks-default: true value: requestId: example-value status: NotStarted result: documentId: example-value classificationResults: - {} '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' x-microcks-operation: delay: 0 dispatcher: FALLBACK /projects/{projectId}/extraction/start: post: operationId: extractDocumentAsync summary: UiPath Start Asynchronous Document Extraction description: >- Submits a previously digitized document for asynchronous data extraction. The extractors to apply and the document type must be specified in the request body. Returns a requestId to poll for results. Results are retained for 24 hours. Use GET /projects/{projectId}/extraction/{requestId}/result to retrieve extracted field values once processing completes. 
tags: - Extraction parameters: - $ref: '#/components/parameters/projectId' example: example-value - name: api-version in: query required: false description: API version to use for this request schema: type: string default: '1' example: 1.0.0 requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/ExtractionRequest' examples: extractDocumentAsyncRequestExample: summary: Default extractDocumentAsync request x-microcks-default: true value: documentId: example-value documentTypeId: example-value extractorsOptions: - {} responses: '200': description: Extraction job started; use the returned requestId to poll for results content: application/json: schema: $ref: '#/components/schemas/AsyncJobStartResponse' examples: extractDocumentAsync200Example: summary: Default extractDocumentAsync 200 response x-microcks-default: true value: requestId: example-value status: NotStarted '400': $ref: '#/components/responses/BadRequest' '401': $ref: '#/components/responses/Unauthorized' x-microcks-operation: delay: 0 dispatcher: FALLBACK /projects/{projectId}/extraction/{requestId}/result: get: operationId: getExtractionResult summary: UiPath Get Asynchronous Extraction Result description: >- Retrieves the result of an asynchronous extraction request by its requestId. The job status can be NotStarted, Running, Failed, or Succeeded. Poll this endpoint until the status is Succeeded or Failed. Returns extracted field values with associated confidence scores. Results are available for 24 hours from the time the job completed. 
tags: - Extraction parameters: - $ref: '#/components/parameters/projectId' example: example-value - $ref: '#/components/parameters/requestId' example: example-value - name: api-version in: query required: false description: API version to use for this request schema: type: string default: '1' example: 1.0.0 responses: '200': description: Extraction result or current job status content: application/json: schema: $ref: '#/components/schemas/ExtractionResult' examples: getExtractionResult200Example: summary: Default getExtractionResult 200 response x-microcks-default: true value: requestId: example-value status: NotStarted result: documentId: example-value extractionResult: {} '401': $ref: '#/components/responses/Unauthorized' '404': $ref: '#/components/responses/NotFound' x-microcks-operation: delay: 0 dispatcher: FALLBACK /projects/{projectId}/validation/start: post: operationId: validateDocumentAsync summary: UiPath Start Asynchronous Document Validation description: >- Submits digitization, classification, and extraction results for asynchronous validation. Validation checks the accuracy of previously processed results and may invoke human-in-the-loop review for low-confidence extractions. Returns a requestId for polling. 
tags: - Validation parameters: - $ref: '#/components/parameters/projectId' example: example-value - name: api-version in: query required: false description: API version to use for this request schema: type: string default: '1' example: 1.0.0 requestBody: required: true content: application/json: schema: $ref: '#/components/schemas/ValidationRequest' examples: validateDocumentAsyncRequestExample: summary: Default validateDocumentAsync request x-microcks-default: true value: documentId: example-value classificationResult: classifierId: {} documentTypeId: {} confidence: {} startPage: {} endPage: {} extractionResult: ResultsVersion: {} DocumentId: {} Fields: {} responses: '200': description: Validation job started successfully content: application/json: schema: $ref: '#/components/schemas/AsyncJobStartResponse' examples: validateDocumentAsync200Example: summary: Default validateDocumentAsync 200 response x-microcks-default: true value: requestId: example-value status: NotStarted '400': $ref: '#/components/responses/BadRequest' '401': $ref: '#/components/responses/Unauthorized' x-microcks-operation: delay: 0 dispatcher: FALLBACK components: securitySchemes: bearerAuth: type: http scheme: bearer bearerFormat: JWT description: >- OAuth 2.0 Bearer token obtained from the UiPath Identity Server. Credentials can also be provided using App ID and App Secret for non-RPA integrations. 
parameters: projectId: name: projectId in: path required: true description: Unique identifier of the Document Understanding project schema: type: string requestId: name: requestId in: path required: true description: Unique identifier of the asynchronous job request to retrieve results for schema: type: string responses: BadRequest: description: The request was malformed or contained invalid parameters content: application/json: schema: $ref: '#/components/schemas/ErrorResponse' Unauthorized: description: The request lacks valid authentication credentials content: application/json: schema: $ref: '#/components/schemas/ErrorResponse' Forbidden: description: The authenticated user does not have permission to perform this action content: application/json: schema: $ref: '#/components/schemas/ErrorResponse' NotFound: description: The requested resource was not found content: application/json: schema: $ref: '#/components/schemas/ErrorResponse' schemas: Project: type: object description: A Document Understanding project containing classifiers and extractors properties: id: type: string description: Unique identifier of the project example: abc123 name: type: string description: Display name of the project example: Example Name description: type: string description: Optional description of the project's purpose example: Example description for this resource. 
extractorsUrl: type: string format: uri description: URL to retrieve the list of extractors for this project example: https://cloud.uipath.com/example classifiersUrl: type: string format: uri description: URL to retrieve the list of classifiers for this project example: https://cloud.uipath.com/example Extractor: type: object description: A model that extracts structured data fields from a document properties: id: type: string description: Unique identifier of the extractor example: abc123 name: type: string description: Display name of the extractor example: Example Name documentType: type: string description: Document type this extractor is designed for example: Standard extractorType: type: string enum: [Specialized, GenerativeAI, FormExtractor, IntelligentFormExtractor] description: Technology category of the extractor model example: Specialized Classifier: type: object description: A model that classifies documents into predefined document types properties: id: type: string description: Unique identifier of the classifier example: abc123 name: type: string description: Display name of the classifier example: Example Name classifierType: type: string enum: [Specialized, GenerativeAI, KeywordBased] description: Technology category of the classifier model example: Specialized documentTypes: type: array items: type: string description: List of document type labels this classifier can identify example: [Standard] DigitizationRequest: type: object description: Request payload for digitizing a document required: - documentId properties: documentId: type: string description: >- Reference identifier for the document. This can be a file path or a document reference key used by UiPath activities. 
example: abc123 contentType: type: string enum: [image/jpeg, image/png, application/pdf, image/tiff] description: MIME type of the document being submitted example: image/jpeg documentBase64: type: string description: Base64-encoded document content when submitting inline example: example-value DigitizationResult: type: object description: Result of a document digitization operation properties: documentId: type: string description: >- Unique identifier assigned to the digitized document. Use this ID in subsequent classification, extraction, and validation calls. Results are retained for seven days. example: abc123 status: type: string enum: [Succeeded, Failed] description: Outcome of the digitization operation example: Succeeded pageCount: type: integer description: Number of pages detected in the document example: 42 AsyncJobStartResponse: type: object description: Response returned when an asynchronous processing job is started properties: requestId: type: string description: Unique identifier of the asynchronous job. Use this to poll for results. 
example: abc123 status: type: string enum: [NotStarted, Running, Failed, Succeeded] description: Initial status of the asynchronous job example: NotStarted ClassificationRequest: type: object description: Request payload for classifying a previously digitized document required: - documentId - classifiersOptions properties: documentId: type: string description: Document ID returned by the digitization endpoint example: abc123 classifiersOptions: type: array items: $ref: '#/components/schemas/ClassifierOption' description: Classifiers to apply with their configuration options example: [] ClassifierOption: type: object description: Configuration for applying a specific classifier required: - classifierId properties: classifierId: type: string description: Unique identifier of the classifier to apply example: abc123 ClassificationResult: type: object description: Result of an asynchronous classification job properties: requestId: type: string description: Unique identifier of the classification job example: abc123 status: type: string enum: [NotStarted, Running, Failed, Succeeded] description: Current status of the classification job example: NotStarted result: type: object description: Classification output, populated when status is Succeeded properties: documentId: type: string description: Document ID that was classified classificationResults: type: array items: $ref: '#/components/schemas/ClassificationResultItem' example: example-value ClassificationResultItem: type: object description: Classification result for a specific document or page range properties: classifierId: type: string description: Identifier of the classifier that produced this result example: abc123 documentTypeId: type: string description: Identified document type identifier example: abc123 confidence: type: number format: float minimum: 0 maximum: 1 description: Confidence score of the classification result (0.0 to 1.0) example: 1.0 startPage: type: integer description: First page of the 
classified document section example: 1 endPage: type: integer description: Last page of the classified document section example: 1 ExtractionRequest: type: object description: Request payload for extracting data from a previously digitized document required: - documentId - extractorsOptions properties: documentId: type: string description: Document ID returned by the digitization endpoint example: abc123 documentTypeId: type: string description: Document type identifier to guide extraction example: abc123 extractorsOptions: type: array items: $ref: '#/components/schemas/ExtractorOption' description: Extractors to apply with their configuration options example: [] ExtractorOption: type: object description: Configuration for applying a specific extractor required: - extractorId properties: extractorId: type: string description: Unique identifier of the extractor to apply example: abc123 ExtractionResult: type: object description: Result of an asynchronous extraction job properties: requestId: type: string description: Unique identifier of the extraction job example: abc123 status: type: string enum: [NotStarted, Running, Failed, Succeeded] description: Current status of the extraction job example: NotStarted result: type: object description: Extraction output, populated when status is Succeeded properties: documentId: type: string description: Document ID from which data was extracted extractionResult: $ref: '#/components/schemas/ExtractionResultData' example: example-value ExtractionResultData: type: object description: Structured extraction result containing field values and confidence scores properties: ResultsVersion: type: integer description: Version of the extraction results format example: 1 DocumentId: type: string description: Document identifier of the extracted document example: abc123 Fields: type: array items: $ref: '#/components/schemas/ExtractedField' description: List of extracted field values example: [] ExtractedField: type: object description: 
A single extracted field with its value and confidence metadata properties: FieldId: type: string description: Identifier of the extracted field as defined in the extractor schema example: abc123 FieldName: type: string description: Display name of the extracted field example: Example Name IsMissing: type: boolean description: Whether the field was not found in the document example: true Value: $ref: '#/components/schemas/FieldValue' FieldValue: type: object description: The extracted value with confidence and source reference properties: Value: description: Extracted value, which may be a string, number, or object depending on field type example: example-value Confidence: type: number format: float minimum: 0 maximum: 1 description: Confidence score for the extracted value (0.0 to 1.0) example: 1.0 OcrConfidence: type: number format: float minimum: 0 maximum: 1 description: OCR confidence score for the text underlying this value example: 1.0 TextType: type: string enum: [Printed, Handwritten] description: Whether the source text was printed or handwritten example: Printed ValidationRequest: type: object description: Request payload for validating document processing results required: - documentId properties: documentId: type: string description: Document ID of the document whose results are to be validated example: abc123 classificationResult: $ref: '#/components/schemas/ClassificationResultItem' extractionResult: $ref: '#/components/schemas/ExtractionResultData' ErrorResponse: type: object description: Standard error response body properties: message: type: string description: Human-readable error message example: example-value errorCode: type: string description: Error code identifier example: example-value traceId: type: string description: Trace identifier for support and debugging example: abc123
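# Illustrative call sequence: a sketch for readers, not part of the specification.
# It only strings together the operations and schemas defined above. The project
# ID (abc123) and all angle-bracket values are hypothetical placeholders, and the
# order shown is one plausible reading of the endpoint descriptions.
#
# 1. digitizeDocument
#      POST /projects/abc123/digitization/start
#      body:
#        documentId: abc123
#        contentType: application/pdf
#        documentBase64: <base64-encoded file>
#      response fields: documentId, status, pageCount
#
# 2. classifyDocumentAsync
#      POST /projects/abc123/classification/start
#      body:
#        documentId: <documentId from step 1>
#        classifiersOptions:
#          - classifierId: <id from listClassifiers>
#      response fields: requestId, status
#
# 3. getClassificationResult
#      GET /projects/abc123/classification/<requestId>/result
#      repeat until status is Succeeded or Failed
#
# 4. extractDocumentAsync
#      POST /projects/abc123/extraction/start
#      body:
#        documentId: <documentId from step 1>
#        documentTypeId: <documentTypeId from step 3>
#        extractorsOptions:
#          - extractorId: <id from listExtractors>
#      response fields: requestId, status
#
# 5. getExtractionResult
#      GET /projects/abc123/extraction/<requestId>/result
#      repeat until status is Succeeded or Failed
#
# 6. validateDocumentAsync
#      POST /projects/abc123/validation/start
#      body:
#        documentId: <documentId from step 1>
#        classificationResult: <one item from step 3>
#        extractionResult: <extractionResult from step 5>
#      response fields: requestId, status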