arazzo: 1.0.1
info:
  title: Dataiku Build Dataset and Poll Job
  summary: Start a build job for a dataset output and poll the job until it reaches a terminal state.
  description: >-
    Builds a dataset in Dataiku DSS and waits for the build to finish. The workflow starts a build
    job targeting a single dataset output, then polls the job status endpoint, looping while the
    job is NOT_STARTED or RUNNING and branching to completion when the job reaches DONE, FAILED,
    or ABORTED. Every step inlines its request so the flow can be executed without opening the
    underlying OpenAPI description.
  version: 1.0.0
sourceDescriptions:
  - name: dssPublicApi
    url: ../openapi/dataiku-public-api-openapi.yml
    type: openapi
workflows:
  - workflowId: build-dataset-job
    summary: Start a dataset build job and poll it to a terminal state.
    description: >-
      Starts a build job for one dataset output and polls getJob until the job is DONE, FAILED,
      or ABORTED.
    inputs:
      type: object
      required:
        - apiKey
        - projectKey
        - datasetName
      properties:
        apiKey:
          type: string
          description: DSS API key passed as a Bearer token in the Authorization header.
        projectKey:
          type: string
          description: Project key that owns the dataset to build.
        datasetName:
          type: string
          description: Name of the dataset to build (used as the output id).
    steps:
      - stepId: startBuild
        description: Start a build job targeting the dataset as a single output.
        operationId: startJob
        parameters:
          - name: Authorization
            in: header
            value: Bearer {$inputs.apiKey}
          - name: projectKey
            in: path
            value: $inputs.projectKey
        requestBody:
          contentType: application/json
          payload:
            outputs:
              - projectKey: $inputs.projectKey
                id: $inputs.datasetName
                type: DATASET
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          jobId: $response.body#/id
      - stepId: pollJob
        description: >-
          Poll the job status. While the job is NOT_STARTED or RUNNING, loop back and poll again;
          once it reaches DONE, FAILED, or ABORTED, the workflow branches to completion.
        operationId: getJob
        parameters:
          - name: Authorization
            in: header
            value: Bearer {$inputs.apiKey}
          - name: projectKey
            in: path
            value: $inputs.projectKey
          - name: jobId
            in: path
            value: $steps.startBuild.outputs.jobId
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          state: $response.body#/state
          endTime: $response.body#/endTime
        onSuccess:
          - name: stillBuilding
            type: goto
            stepId: pollJob
            criteria:
              - context: $response.body
                condition: $.state == "RUNNING" || $.state == "NOT_STARTED"
                type: jsonpath
          - name: terminal
            type: end
            criteria:
              - context: $response.body
                condition: $.state == "DONE" || $.state == "FAILED" || $.state == "ABORTED"
                type: jsonpath
    outputs:
      jobId: $steps.startBuild.outputs.jobId
      finalState: $steps.pollJob.outputs.state