arazzo: 1.0.1
info:
  title: Amazon Neptune Bulk Loader Start and Poll
  summary: Start a bulk loader job from S3 and poll its status until the load completes.
  description: >-
    The canonical Neptune bulk-load pattern over the Data API. The workflow starts a bulk loader job from an
    Amazon S3 source, captures the returned load id, then repeatedly polls the loader job status with detailed
    feed counts until the overall status reaches LOAD_COMPLETED. A poll loop with a retry delay handles the
    LOAD_IN_PROGRESS state, and a branch ends the flow once the load finishes. Every step spells out its request
    inline so the flow can be read and executed without opening the underlying OpenAPI description.
  version: 1.0.0
sourceDescriptions:
  - name: neptuneDataApi
    url: ../openapi/amazon-neptune-data-openapi.yml
    type: openapi
workflows:
  - workflowId: bulk-loader-start-and-poll
    summary: Kick off a bulk load from S3 and poll until the load is complete.
    description: >-
      Starts a bulk loader job, then polls its status on a loop until the overall status is LOAD_COMPLETED.
    inputs:
      type: object
      required:
        - source
        - format
        - iamRoleArn
        - region
      properties:
        source:
          type: string
          description: The S3 URI of the data files or folders to load.
        format:
          type: string
          description: The data format (csv, opencypher, ntriples, nquads, rdfxml, turtle).
        iamRoleArn:
          type: string
          description: The ARN of the IAM role with S3 access.
        region:
          type: string
          description: The AWS Region of the S3 bucket.
        parallelism:
          type: string
          description: The degree of parallelism (LOW, MEDIUM, HIGH, OVERSUBSCRIBE).
        failOnError:
          type: string
          description: Whether to stop the load on error (TRUE or FALSE).
    steps:
      - stepId: startLoad
        description: >-
          Start a bulk loader job from the supplied S3 source and capture the returned load id.
        operationId: startLoaderJob
        requestBody:
          contentType: application/json
          payload:
            source: $inputs.source
            format: $inputs.format
            iamRoleArn: $inputs.iamRoleArn
            region: $inputs.region
            parallelism: $inputs.parallelism
            failOnError: $inputs.failOnError
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          loadId: $response.body#/payload/loadId
      - stepId: pollLoad
        description: >-
          Poll the loader job status with detailed feed counts. Repeat while the overall status is
          LOAD_IN_PROGRESS, and finish once it is LOAD_COMPLETED.
        operationId: getLoaderJobStatus
        parameters:
          - name: loadId
            in: path
            value: $steps.startLoad.outputs.loadId
          - name: details
            in: query
            value: true
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          overallStatus: $response.body#/payload/overallStatus/status
          totalRecords: $response.body#/payload/overallStatus/totalRecords
        onSuccess:
          - name: loadStillRunning
            type: retry
            retryAfter: 30
            retryLimit: 60
            criteria:
              - context: $response.body
                condition: $.payload.overallStatus.status == "LOAD_IN_PROGRESS"
                type: jsonpath
          - name: loadComplete
            type: end
            criteria:
              - context: $response.body
                condition: $.payload.overallStatus.status == "LOAD_COMPLETED"
                type: jsonpath
    outputs:
      loadId: $steps.startLoad.outputs.loadId
      overallStatus: $steps.pollLoad.outputs.overallStatus
      totalRecords: $steps.pollLoad.outputs.totalRecords
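# Example inputs for bulk-loader-start-and-poll. This is a sketch only: the
# bucket name, account id, and role name are hypothetical placeholders and do
# not come from this workflow. How inputs are supplied depends on the Arazzo
# runner in use.
#
#   source: s3://example-bucket/neptune-data/
#   format: csv
#   iamRoleArn: arn:aws:iam::123456789012:role/ExampleNeptuneLoadRole
#   region: us-east-1
#   parallelism: MEDIUM
#   failOnError: "FALSE"
#
# With retryAfter: 30 and retryLimit: 60 as declared on pollLoad, the poll loop
# waits at most about 30 x 60 = 1800 seconds (30 minutes) between polls in
# total, assuming the runner treats retryAfter as seconds and applies it on
# every retry.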