arazzo: 1.0.1
info:
  title: Hyperbrowser Scrape and Retrieve
  summary: Start a scrape job for a single URL, poll its status, then fetch the result.
  description: >-
    The canonical asynchronous scrape pattern. The workflow starts a scrape job
    for a single URL with the requested output formats, polls the lightweight
    status endpoint until the job reaches a terminal state, and then fetches the
    full job record containing the scraped markdown, html, links, and metadata.
    Every step spells out its request inline so the flow can be read and
    executed without opening the underlying OpenAPI description.
  version: 1.0.0
sourceDescriptions:
  - name: scrapeApi
    url: ../openapi/hyperbrowser-scrape-api-openapi.yml
    type: openapi
workflows:
  - workflowId: scrape-and-retrieve
    summary: Scrape one page asynchronously and return its extracted content.
    description: >-
      Submits a scrape job, waits for it to complete by polling status, and
      pulls the scraped page data once the job finishes.
    inputs:
      type: object
      required:
        - apiKey
        - url
      properties:
        apiKey:
          type: string
          description: Hyperbrowser account API key sent in the x-api-key header.
        url:
          type: string
          description: The URL of the page to scrape.
    steps:
      - stepId: startScrape
        description: >-
          Submit a scrape job for the supplied URL and capture the returned
          jobId used to track and retrieve results.
        operationId: post-api-scrape
        parameters:
          - name: x-api-key
            in: header
            value: $inputs.apiKey
        requestBody:
          contentType: application/json
          payload:
            url: $inputs.url
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          jobId: $response.body#/jobId
      - stepId: pollStatus
        description: >-
          Poll the scrape job status. The status moves through pending and
          running before reaching completed, failed, or stopped; loop back while
          still in progress and branch out on a terminal state.
        operationId: get-api-scrape-id-status
        parameters:
          - name: x-api-key
            in: header
            value: $inputs.apiKey
          - name: id
            in: path
            value: $steps.startScrape.outputs.jobId
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          status: $response.body#/status
        onSuccess:
          - name: scrapeComplete
            type: goto
            stepId: getResult
            criteria:
              - context: $response.body
                condition: $.status == "completed"
                type: jsonpath
          - name: scrapeRunning
            type: goto
            stepId: pollStatus
            criteria:
              - context: $response.body
                condition: $.status == "pending" || $.status == "running"
                type: jsonpath
      - stepId: getResult
        description: >-
          Fetch the completed scrape job to return the extracted markdown, html,
          links, and page metadata.
        operationId: get-api-scrape-id
        parameters:
          - name: x-api-key
            in: header
            value: $inputs.apiKey
          - name: id
            in: path
            value: $steps.startScrape.outputs.jobId
        successCriteria:
          - condition: $statusCode == 200
          - context: $response.body
            condition: $.status == "completed"
            type: jsonpath
        outputs:
          status: $response.body#/status
          markdown: $response.body#/data/markdown
          links: $response.body#/data/links
          metadata: $response.body#/data/metadata
    outputs:
      jobId: $steps.startScrape.outputs.jobId
      markdown: $steps.getResult.outputs.markdown
      links: $steps.getResult.outputs.links