arazzo: 1.0.1
info:
  title: Amazon Polly List Voices and Synthesize Speech
  summary: Pick an available voice for a language and synthesize speech with it.
  description: >-
    The most common Amazon Polly text-to-speech pattern. The workflow first
    calls DescribeVoices, optionally filtered by an engine and language code,
    to discover an available voice. It captures the first returned voice id and
    then calls SynthesizeSpeech to turn the supplied input text into an audio
    stream using that voice. Every step spells out its request inline,
    including the AWS Signature Version 4 signing headers, so the flow can be
    read and executed without opening the underlying OpenAPI description.
  version: 1.0.0
sourceDescriptions:
  - name: pollyApi
    url: ../openapi/amazon-polly-openapi-original.yaml
    type: openapi
workflows:
  - workflowId: list-voices-synthesize-speech
    summary: Discover an available voice for a language and synthesize speech with it.
    description: >-
      Lists the voices available for the requested engine and language, selects
      the first match, and synthesizes the supplied text into an audio stream
      using that voice and output format.
    inputs:
      type: object
      required:
        - amzDate
        - authorization
        - text
        - outputFormat
      properties:
        amzDate:
          type: string
          description: The X-Amz-Date timestamp (e.g. 20260604T120000Z) used to sign the requests.
        authorization:
          type: string
          description: The full SigV4 Authorization header value for the request.
        contentSha256:
          type: string
          description: The X-Amz-Content-Sha256 hex digest of the request payload.
        securityToken:
          type: string
          description: Optional X-Amz-Security-Token for temporary credentials.
        engine:
          type: string
          description: Engine to filter voices and to use for synthesis (standard or neural).
        languageCode:
          type: string
          description: ISO language code to filter voices by (e.g. en-US).
        text:
          type: string
          description: The input text (plain text or SSML) to synthesize.
        textType:
          type: string
          description: Whether the input text is plain text or ssml.
        outputFormat:
          type: string
          description: The audio output format (mp3, ogg_vorbis, pcm, or json).
    steps:
      - stepId: describeVoices
        description: >-
          List the voices available for the requested engine and language so a
          concrete voice id can be selected for synthesis.
        operationId: DescribeVoices
        parameters:
          - name: Engine
            in: query
            value: $inputs.engine
          - name: LanguageCode
            in: query
            value: $inputs.languageCode
          - name: X-Amz-Date
            in: header
            value: $inputs.amzDate
          - name: Authorization
            in: header
            value: $inputs.authorization
          - name: X-Amz-Content-Sha256
            in: header
            value: $inputs.contentSha256
          - name: X-Amz-Security-Token
            in: header
            value: $inputs.securityToken
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          voices: $response.body#/Voices
          selectedVoiceId: $response.body#/Voices/0/Id
          nextToken: $response.body#/NextToken
      - stepId: synthesizeSpeech
        description: >-
          Synthesize the supplied text into an audio stream using the voice
          selected from the DescribeVoices response.
        operationId: SynthesizeSpeech
        parameters:
          - name: X-Amz-Date
            in: header
            value: $inputs.amzDate
          - name: Authorization
            in: header
            value: $inputs.authorization
          - name: X-Amz-Content-Sha256
            in: header
            value: $inputs.contentSha256
          - name: X-Amz-Security-Token
            in: header
            value: $inputs.securityToken
        requestBody:
          contentType: application/json
          payload:
            Engine: $inputs.engine
            LanguageCode: $inputs.languageCode
            OutputFormat: $inputs.outputFormat
            Text: $inputs.text
            TextType: $inputs.textType
            VoiceId: $steps.describeVoices.outputs.selectedVoiceId
        successCriteria:
          - condition: $statusCode == 200
        outputs:
          contentType: $response.body#/ContentType
          requestCharacters: $response.body#/RequestCharacters
    outputs:
      selectedVoiceId: $steps.describeVoices.outputs.selectedVoiceId
      requestCharacters: $steps.synthesizeSpeech.outputs.requestCharacters