# Brainiall Text-to-Speech (TTS) API: Full Documentation

## Overview

Brainiall TTS API provides production-ready text-to-speech synthesis with 12 natural-sounding English voices (American and British accents). Powered by the Kokoro speech synthesis engine, it delivers 24kHz WAV audio with sub-second latency on GPU. It is positioned as an affordable alternative to ElevenLabs, Google Cloud TTS, and Amazon Polly.

Pricing: $0.01-0.03 per 1,000 characters ($10-30 per million characters). Compare: ElevenLabs $120-198 per million characters, Google Cloud TTS $4 per million characters.

## Base URL

`https://apim-ai-apis.azure-api.net/v1/tts`

## Authentication

Include ONE of these headers in every request:

1. Bearer Token: `Authorization: Bearer YOUR_KEY`
2. API Key: `api-key: YOUR_KEY`
3. Subscription Key: `Ocp-Apim-Subscription-Key: YOUR_KEY`

Get your API key at https://brainiall.com

## Endpoints

### POST /v1/tts/synthesize

Convert text to speech audio. Returns binary WAV data (24kHz, 16-bit PCM).

Request:

```json
{
  "text": "Hello, welcome to our application.",
  "voice": "af_heart",
  "speed": 1.0,
  "format": "wav"
}
```

Parameters:

- `text` (string, required): Text to synthesize. 1-5000 characters.
- `voice` (string, optional): Voice ID. Default: `af_heart`. See voice list below.
- `speed` (number, optional): Speech speed multiplier. Range: 0.25-4.0. Default: 1.0.
- `format` (string, optional): Output format. Only `wav` is currently supported. Default: `wav`.

Response: Binary `audio/wav` data (24kHz, 16-bit PCM mono).

Response headers:

- `X-Audio-Duration-Ms`: Duration of generated audio in milliseconds
- `X-Voice`: Voice ID used for synthesis
- `X-Text-Length`: Number of characters processed

### GET /v1/tts/voices

List all available TTS voices with metadata.
Response:

```json
{
  "voices": [
    {"id": "af_heart", "name": "Heart", "gender": "female", "accent": "american"},
    {"id": "af_bella", "name": "Bella", "gender": "female", "accent": "american"},
    {"id": "af_nicole", "name": "Nicole", "gender": "female", "accent": "american"},
    {"id": "af_sarah", "name": "Sarah", "gender": "female", "accent": "american"},
    {"id": "af_sky", "name": "Sky", "gender": "female", "accent": "american"},
    {"id": "am_adam", "name": "Adam", "gender": "male", "accent": "american"},
    {"id": "am_michael", "name": "Michael", "gender": "male", "accent": "american"},
    {"id": "bf_emma", "name": "Emma", "gender": "female", "accent": "british"},
    {"id": "bf_isabella", "name": "Isabella", "gender": "female", "accent": "british"},
    {"id": "bm_george", "name": "George", "gender": "male", "accent": "british"},
    {"id": "bm_lewis", "name": "Lewis", "gender": "male", "accent": "british"},
    {"id": "bm_daniel", "name": "Daniel", "gender": "male", "accent": "british"}
  ],
  "defaultVoice": "af_heart"
}
```

Voice ID naming convention:

- `af_` = American female
- `am_` = American male
- `bf_` = British female
- `bm_` = British male

### GET /v1/tts/health

Health check endpoint.

Response:

```json
{"status": "healthy", "modelLoaded": true, "voices": 12}
```

## Code Examples

### Python: Basic Text-to-Speech

```python
import requests

API_KEY = "YOUR_KEY"
BASE_URL = "https://apim-ai-apis.azure-api.net/v1/tts"
HEADERS = {"Ocp-Apim-Subscription-Key": API_KEY}

# Synthesize speech
response = requests.post(
    f"{BASE_URL}/synthesize",
    headers=HEADERS,
    json={
        "text": "Hello, welcome to our application. How can I help you today?",
        "voice": "af_heart",
        "speed": 1.0,
    },
    timeout=30,
)
response.raise_for_status()  # Fail early instead of saving an error body as audio

# Save the audio file
with open("output.wav", "wb") as f:
    f.write(response.content)

print(f"Audio duration: {response.headers.get('X-Audio-Duration-Ms')}ms")
print(f"Voice used: {response.headers.get('X-Voice')}")
```

### Python: Async TTS with httpx

```python
import asyncio
from pathlib import Path

import httpx

API_KEY = "YOUR_KEY"
BASE_URL = "https://apim-ai-apis.azure-api.net/v1/tts"


async def synthesize_async(
    text: str,
    voice: str = "af_heart",
    speed: float = 1.0,
    client: httpx.AsyncClient | None = None,
) -> bytes:
    """Async text-to-speech synthesis."""
    # Reuse the caller's client if one is given; otherwise create a temporary one
    _client = client or httpx.AsyncClient(timeout=30.0)
    try:
        response = await _client.post(
            f"{BASE_URL}/synthesize",
            headers={"Ocp-Apim-Subscription-Key": API_KEY},
            json={"text": text, "voice": voice, "speed": speed},
        )
        response.raise_for_status()
        return response.content
    finally:
        # Only close the client that this function created
        if client is None:
            await _client.aclose()


async def batch_synthesize(items: list[dict], output_dir: str = "audio") -> list[str]:
    """Synthesize multiple texts concurrently. Returns list of file paths."""
    Path(output_dir).mkdir(exist_ok=True)
    async with httpx.AsyncClient(timeout=30.0) as client:
        tasks = [
            synthesize_async(
                item["text"],
                item.get("voice", "af_heart"),
                item.get("speed", 1.0),
                client=client,
            )
            for item in items
        ]
        # return_exceptions=True keeps one failure from cancelling the rest
        results = await asyncio.gather(*tasks, return_exceptions=True)

    paths = []
    for i, result in enumerate(results):
        if isinstance(result, Exception):
            print(f"Error on item {i}: {result}")
            continue
        path = f"{output_dir}/audio_{i:03d}.wav"
        Path(path).write_bytes(result)
        paths.append(path)
    return paths


# Usage
async def main():
    items = [
        {"text": "Welcome to our platform.", "voice": "af_heart"},
        {"text": "Let's get started with the tutorial.", "voice": "bf_emma"},
        {"text": "Thank you for watching.", "voice": "am_michael", "speed": 0.9},
    ]
    paths = await batch_synthesize(items)
    print(f"Generated {len(paths)} audio files")


asyncio.run(main())
```

### Python: TTS with Error Handling and Retry

```python
import time
from pathlib import Path

import requests


class BrainiallTTS:
    """Production TTS client with retry logic and error handling."""

    def __init__(self, api_key: str, max_retries: int = 3, timeout: float = 30.0):
        self.base_url = "https://apim-ai-apis.azure-api.net/v1/tts"
        self.headers = {"Ocp-Apim-Subscription-Key": api_key}
        self.max_retries = max_retries
        self.timeout = timeout
        self._voices_cache = None

    def synthesize(
        self,
        text: str,
        voice: str = "af_heart",
        speed: float = 1.0,
    ) -> dict:
        """Synthesize text and return audio bytes with metadata."""
        if not text or len(text) > 5000:
            raise ValueError(f"Text must be 1-5000 chars, got {len(text or '')}")
        if not 0.25 <= speed <= 4.0:
            raise ValueError(f"Speed must be 0.25-4.0, got {speed}")

        for attempt in range(self.max_retries):
            try:
                response = requests.post(
                    f"{self.base_url}/synthesize",
                    headers=self.headers,
                    json={"text": text, "voice": voice, "speed": speed},
                    timeout=self.timeout,
                )
                response.raise_for_status()
                return {
                    "audio": response.content,
                    "duration_ms": int(response.headers.get("X-Audio-Duration-Ms", 0)),
                    "voice": response.headers.get("X-Voice", voice),
                    "text_length": int(response.headers.get("X-Text-Length", len(text))),
                }
            except requests.exceptions.RequestException as e:
                # Client errors other than 429 will not succeed on retry
                status = getattr(e.response, "status_code", None)
                non_retryable = status is not None and 400 <= status < 500 and status != 429
                if non_retryable or attempt == self.max_retries - 1:
                    raise
                wait = 2 ** attempt  # Exponential backoff: 1s, 2s, 4s, ...
                print(f"Retry {attempt + 1}/{self.max_retries} after {wait}s: {e}")
                time.sleep(wait)

    def get_voices(self, accent: str | None = None, gender: str | None = None) -> list[dict]:
        """Get available voices, optionally filtered by accent or gender."""
        if self._voices_cache is None:
            response = requests.get(
                f"{self.base_url}/voices",
                headers=self.headers,
                timeout=self.timeout,
            )
            response.raise_for_status()
            self._voices_cache = response.json()["voices"]

        voices = self._voices_cache
        if accent:
            voices = [v for v in voices if v["accent"] == accent]
        if gender:
            voices = [v for v in voices if v["gender"] == gender]
        return voices

    def save(self, text: str, filepath: str, voice: str = "af_heart", speed: float = 1.0):
        """Synthesize and save to file in one call."""
        result = self.synthesize(text, voice, speed)
        Path(filepath).write_bytes(result["audio"])
        return result


# Usage
tts = BrainiallTTS(api_key="YOUR_KEY")

# Get British male voices
british_males = tts.get_voices(accent="british", gender="male")
print(f"British male voices: {[v['name'] for v in british_males]}")

# Synthesize with auto-retry
result = tts.save("The quick brown fox jumps over the lazy dog.", "output.wav")
print(f"Duration: {result['duration_ms']}ms")
```

### Python: List Available Voices

```python
import requests

API_KEY = "YOUR_KEY"
HEADERS = {"Ocp-Apim-Subscription-Key": API_KEY}

response = requests.get(
    "https://apim-ai-apis.azure-api.net/v1/tts/voices",
    headers=HEADERS,
    timeout=10,
)
response.raise_for_status()
voices = response.json()

for voice in voices["voices"]:
    print(f"{voice['id']:15s} {voice['name']:10s} {voice['gender']:8s} {voice['accent']}")
```

### Python: TTS with Voice Selection by Accent

```python
import requests

API_KEY = "YOUR_KEY"
BASE_URL = "https://apim-ai-apis.azure-api.net/v1/tts"
HEADERS = {"Ocp-Apim-Subscription-Key": API_KEY}


def synthesize(text: str, voice: str = "af_heart", speed: float = 1.0) -> bytes:
    """Synthesize text to speech and return WAV audio bytes."""
    response = requests.post(
        f"{BASE_URL}/synthesize",
        headers=HEADERS,
        json={"text": text, "voice": voice, "speed": speed},
        timeout=30,
    )
    response.raise_for_status()
    return response.content


# American female voice
audio = synthesize("Good morning! Let's get started.", voice="af_heart")
with open("american_female.wav", "wb") as f:
    f.write(audio)

# British male voice
audio = synthesize("Good morning! Let's get started.", voice="bm_george")
with open("british_male.wav", "wb") as f:
    f.write(audio)

# Slow speed for language learning
audio = synthesize("The quick brown fox jumps over the lazy dog.", voice="bf_emma", speed=0.7)
with open("slow_british.wav", "wb") as f:
    f.write(audio)

# Fast narration
audio = synthesize("Breaking news from the financial markets today.", voice="am_adam", speed=1.3)
with open("fast_narration.wav", "wb") as f:
    f.write(audio)
```

### Python: Batch TTS for Multiple Sentences

```python
import requests
from pathlib import Path

API_KEY = "YOUR_KEY"
BASE_URL = "https://apim-ai-apis.azure-api.net/v1/tts"
HEADERS = {"Ocp-Apim-Subscription-Key": API_KEY}

sentences = [
    "Welcome to our language learning platform.",
    "Today we will practice pronunciation.",
    "Listen carefully and repeat after me.",
    "The weather is beautiful today.",
    "Let's review what we learned yesterday.",
]

output_dir = Path("audio_files")
output_dir.mkdir(exist_ok=True)

for i, text in enumerate(sentences):
    response = requests.post(
        f"{BASE_URL}/synthesize",
        headers=HEADERS,
        json={"text": text, "voice": "bf_emma", "speed": 0.9},
        timeout=30,
    )
    response.raise_for_status()  # Do not write an error body to a .wav file
    filepath = output_dir / f"sentence_{i+1:02d}.wav"
    filepath.write_bytes(response.content)
    duration = response.headers.get("X-Audio-Duration-Ms", "?")
    print(f"[{i+1}/{len(sentences)}] {filepath.name} ({duration}ms): {text[:50]}")
print(f"\nGenerated {len(sentences)} audio files in {output_dir}/")
```

### Python: Audiobook Chapter Generator

```python
import re
from pathlib import Path

import requests

API_KEY = "YOUR_KEY"
BASE_URL = "https://apim-ai-apis.azure-api.net/v1/tts"
HEADERS = {"Ocp-Apim-Subscription-Key": API_KEY}


def split_into_chunks(text: str, max_chars: int = 4500) -> list[str]:
    """Split text into chunks at sentence boundaries, respecting max_chars.

    A single sentence longer than max_chars is kept whole, so it can still
    exceed the limit.
    """
    sentences = re.split(r'(?<=[.!?])\s+', text)
    chunks = []
    current = ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would exceed the limit
        if len(current) + len(sentence) + 1 > max_chars:
            if current:
                chunks.append(current.strip())
            current = sentence
        else:
            current = f"{current} {sentence}" if current else sentence
    if current:
        chunks.append(current.strip())
    return chunks


def generate_audiobook_chapter(
    text: str,
    output_dir: str,
    voice: str = "af_sarah",
    speed: float = 0.95,
) -> list[str]:
    """Generate audiobook audio files from long text. Returns list of file paths."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    chunks = split_into_chunks(text)
    paths = []
    total_duration_ms = 0

    for i, chunk in enumerate(chunks):
        response = requests.post(
            f"{BASE_URL}/synthesize",
            headers=HEADERS,
            json={"text": chunk, "voice": voice, "speed": speed},
            timeout=30,
        )
        response.raise_for_status()
        filepath = f"{output_dir}/part_{i+1:03d}.wav"
        Path(filepath).write_bytes(response.content)
        duration = int(response.headers.get("X-Audio-Duration-Ms", 0))
        total_duration_ms += duration
        paths.append(filepath)
        print(f"  Part {i+1}/{len(chunks)}: {duration}ms ({len(chunk)} chars)")

    print(f"\nTotal: {len(chunks)} parts, {total_duration_ms / 1000:.1f}s audio")
    return paths


# Usage
chapter_text = """
Once upon a time, in a land far away, there lived a wise old wizard.
He spent his days studying ancient texts and brewing magical potions.
One morning, a young traveler arrived at his doorstep seeking guidance.
The wizard looked at the traveler and smiled knowingly.
""".strip()

paths = generate_audiobook_chapter(chapter_text, "audiobook/chapter_01", voice="bm_george")
```

### Python: FastAPI TTS Proxy Service

```python
from fastapi import FastAPI, HTTPException
from fastapi.responses import Response
from pydantic import BaseModel, Field
import httpx

app = FastAPI(title="TTS Proxy Service")

BRAINIALL_KEY = "YOUR_KEY"
TTS_URL = "https://apim-ai-apis.azure-api.net/v1/tts"


class SynthesizeRequest(BaseModel):
    text: str = Field(..., min_length=1, max_length=5000)
    voice: str = Field(default="af_heart")
    speed: float = Field(default=1.0, ge=0.25, le=4.0)


@app.post("/api/tts/synthesize")
async def synthesize(req: SynthesizeRequest):
    """Proxy TTS synthesis with validation and caching headers."""
    async with httpx.AsyncClient(timeout=30.0) as client:
        response = await client.post(
            f"{TTS_URL}/synthesize",
            headers={"Ocp-Apim-Subscription-Key": BRAINIALL_KEY},
            json=req.model_dump(),
        )

    if response.status_code != 200:
        raise HTTPException(status_code=response.status_code, detail="TTS synthesis failed")

    return Response(
        content=response.content,
        media_type="audio/wav",
        headers={
            "X-Audio-Duration-Ms": response.headers.get("X-Audio-Duration-Ms", "0"),
            "X-Voice": response.headers.get("X-Voice", req.voice),
            "Cache-Control": "public, max-age=86400",
        },
    )


@app.get("/api/tts/voices")
async def list_voices():
    """Get available TTS voices."""
    async with httpx.AsyncClient(timeout=10.0) as client:
        response = await client.get(
            f"{TTS_URL}/voices",
            headers={"Ocp-Apim-Subscription-Key": BRAINIALL_KEY},
        )
    if response.status_code != 200:
        raise HTTPException(status_code=response.status_code, detail="Failed to fetch voices")
    return response.json()
```

### Python: LLM + TTS Pipeline (Generate then Speak)

```python
from openai import OpenAI
import requests

API_KEY = "YOUR_KEY"

client = OpenAI(
    base_url="https://apim-ai-apis.azure-api.net/v1",
    api_key=API_KEY,
)

# Step 1: Generate text with LLM
response = client.chat.completions.create(
    model="claude-haiku-4-5",
    messages=[{"role": "user", "content": "Write a 2-sentence greeting for a podcast intro."}],
)
generated_text = response.choices[0].message.content

# Step 2: Convert to speech (the TTS endpoint accepts at most 5000 characters)
tts_response = requests.post(
    "https://apim-ai-apis.azure-api.net/v1/tts/synthesize",
    headers={"Ocp-Apim-Subscription-Key": API_KEY},
    json={"text": generated_text[:5000], "voice": "am_michael", "speed": 1.0},
    timeout=30,
)
tts_response.raise_for_status()

with open("podcast_intro.wav", "wb") as f:
    f.write(tts_response.content)

print(f"Generated intro: {generated_text}")
```

### Python: LangChain TTS Tool Integration

```python
import base64

import requests
from langchain_core.tools import tool
from langchain_brainiall import ChatBrainiall
from langgraph.prebuilt import create_react_agent

API_KEY = "YOUR_KEY"
TTS_URL = "https://apim-ai-apis.azure-api.net/v1/tts"


@tool
def text_to_speech(text: str, voice: str = "af_heart", speed: float = 1.0) -> str:
    """Convert text to speech audio. Returns base64-encoded WAV audio.
    Available voices: af_heart (warm female), bf_emma (British female),
    am_michael (professional male), bm_george (British male narrator).
    Speed range: 0.25 (very slow) to 4.0 (very fast)."""
    response = requests.post(
        f"{TTS_URL}/synthesize",
        headers={"Ocp-Apim-Subscription-Key": API_KEY},
        json={"text": text, "voice": voice, "speed": speed},
        timeout=30,
    )
    response.raise_for_status()
    audio_b64 = base64.b64encode(response.content).decode()
    duration = response.headers.get("X-Audio-Duration-Ms", "?")
    # Only a short base64 preview is returned to keep the agent's context small
    return f"Audio generated: {duration}ms, {len(response.content)} bytes. Base64: {audio_b64[:50]}..."
@tool
def list_tts_voices() -> str:
    """List all available text-to-speech voices with their accents and genders."""
    response = requests.get(
        f"{TTS_URL}/voices",
        headers={"Ocp-Apim-Subscription-Key": API_KEY},
        timeout=10,
    )
    response.raise_for_status()
    voices = response.json()["voices"]
    return "\n".join(f"{v['id']}: {v['name']} ({v['gender']}, {v['accent']})" for v in voices)


# Create agent with TTS capabilities
llm = ChatBrainiall(model="claude-sonnet-4-6", api_key=API_KEY)
agent = create_react_agent(llm, [text_to_speech, list_tts_voices])

result = agent.invoke({
    "messages": [("human", "List available British voices, then synthesize 'Good morning' with the best male British voice")]
})

for msg in result["messages"]:
    if msg.content:
        print(f"{msg.type}: {msg.content[:200]}")
```

### Python: Notification System with TTS

```python
import requests
from pathlib import Path
from datetime import datetime

API_KEY = "YOUR_KEY"
TTS_URL = "https://apim-ai-apis.azure-api.net/v1/tts"
HEADERS = {"Ocp-Apim-Subscription-Key": API_KEY}

# Voice assignment by notification type
NOTIFICATION_VOICES = {
    "alert": {"voice": "am_adam", "speed": 1.1},        # Urgent, authoritative
    "reminder": {"voice": "af_heart", "speed": 0.95},   # Warm, calm
    "greeting": {"voice": "bf_emma", "speed": 1.0},     # Friendly, British
    "news": {"voice": "bm_george", "speed": 1.05},      # Documentary style
    "tutorial": {"voice": "af_nicole", "speed": 0.85},  # Clear, professional
}


def generate_notification(
    message: str,
    notification_type: str = "reminder",
    output_dir: str = "notifications",
) -> str:
    """Generate a spoken notification audio file."""
    # Unknown notification types fall back to the "reminder" voice settings
    config = NOTIFICATION_VOICES.get(notification_type, NOTIFICATION_VOICES["reminder"])
    Path(output_dir).mkdir(exist_ok=True)

    response = requests.post(
        f"{TTS_URL}/synthesize",
        headers=HEADERS,
        json={"text": message, **config},
        timeout=30,
    )
    response.raise_for_status()

    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    filepath = f"{output_dir}/{notification_type}_{timestamp}.wav"
    Path(filepath).write_bytes(response.content)

    duration = response.headers.get("X-Audio-Duration-Ms", "?")
    print(f"[{notification_type}] {duration}ms: {message[:60]}...")
    return filepath


# Usage
generate_notification("Your meeting starts in 5 minutes.", "alert")
generate_notification("Don't forget to review your weekly report.", "reminder")
generate_notification("Good morning! Today's weather is sunny with a high of 72.", "greeting")
generate_notification("The market closed up 1.2 percent today.", "news")
```

### JavaScript: Text-to-Speech

```javascript
// Run as an ES module (e.g. a .mjs file); the global fetch requires Node.js 18+
import { writeFileSync } from "node:fs";

const API_KEY = "YOUR_KEY";
const BASE_URL = "https://apim-ai-apis.azure-api.net/v1/tts";

async function synthesize(text, voice = "af_heart", speed = 1.0) {
  const response = await fetch(`${BASE_URL}/synthesize`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Ocp-Apim-Subscription-Key": API_KEY,
    },
    body: JSON.stringify({ text, voice, speed }),
  });
  if (!response.ok) {
    throw new Error(`TTS request failed: ${response.status}`);
  }

  const audioBuffer = await response.arrayBuffer();
  const duration = response.headers.get("X-Audio-Duration-Ms");
  console.log(`Generated ${duration}ms of audio`);
  return Buffer.from(audioBuffer);
}

// Generate speech
const audio = await synthesize("Hello from Brainiall TTS!", "bf_emma");

// Save to file (Node.js)
writeFileSync("output.wav", audio);
```

### JavaScript: Express.js TTS Service

```javascript
import express from "express";

const app = express();
app.use(express.json());

const API_KEY = process.env.BRAINIALL_API_KEY || "YOUR_KEY";
const TTS_URL = "https://apim-ai-apis.azure-api.net/v1/tts";

app.post("/api/speak", async (req, res) => {
  const { text, voice = "af_heart", speed = 1.0 } = req.body;

  if (!text || text.length > 5000) {
    return res.status(400).json({ error: "Text required, max 5000 chars" });
  }

  try {
    const response = await fetch(`${TTS_URL}/synthesize`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": API_KEY,
      },
      body: JSON.stringify({ text, voice, speed }),
    });

    if (!response.ok) throw new Error(`TTS failed: ${response.status}`);

    const audioBuffer = await response.arrayBuffer();
    res.set({
      "Content-Type": "audio/wav",
      "X-Audio-Duration-Ms": response.headers.get("X-Audio-Duration-Ms") ?? "0",
      "Cache-Control": "public, max-age=3600",
    });
    res.send(Buffer.from(audioBuffer));
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});

app.get("/api/voices", async (req, res) => {
  try {
    const response = await fetch(`${TTS_URL}/voices`, {
      headers: { "Ocp-Apim-Subscription-Key": API_KEY },
    });
    res.status(response.status).json(await response.json());
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});

app.listen(3000, () => console.log("TTS service on :3000"));
```

### JavaScript: List Voices and Synthesize

```javascript
import { writeFileSync } from "node:fs";

const API_KEY = "YOUR_KEY";
const BASE_URL = "https://apim-ai-apis.azure-api.net/v1/tts";
const headers = { "Ocp-Apim-Subscription-Key": API_KEY };

// Get available voices
const voicesResponse = await fetch(`${BASE_URL}/voices`, { headers });
const { voices } = await voicesResponse.json();

console.log("Available voices:");
voices.forEach((v) => {
  console.log(`  ${v.id} - ${v.name} (${v.gender}, ${v.accent})`);
});

// Synthesize with each British voice
for (const voice of voices.filter((v) => v.accent === "british")) {
  const response = await fetch(`${BASE_URL}/synthesize`, {
    method: "POST",
    headers: { ...headers, "Content-Type": "application/json" },
    body: JSON.stringify({
      text: "Good afternoon. How may I assist you?",
      voice: voice.id,
    }),
  });
  if (!response.ok) {
    console.error(`Skipping ${voice.id}: HTTP ${response.status}`);
    continue;
  }

  const audio = Buffer.from(await response.arrayBuffer());
  writeFileSync(`${voice.id}.wav`, audio);
  console.log(`Saved ${voice.id}.wav (${voice.name})`);
}
```

### curl: TTS Examples

```bash
API_KEY="YOUR_KEY"
BASE="https://apim-ai-apis.azure-api.net/v1/tts"

# Synthesize speech (output to file)
curl -X POST "$BASE/synthesize" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: $API_KEY" \
  -d '{"text": "Hello world, this is a text to speech test.", "voice": "af_heart", "speed": 1.0}' \
  --output output.wav

# List available voices
curl -s "$BASE/voices" \
  -H "Ocp-Apim-Subscription-Key: $API_KEY" | python3 -m json.tool

# British male voice at slow speed
curl -X POST "$BASE/synthesize" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: $API_KEY" \
  -d '{"text": "The weather in London is quite pleasant today.", "voice": "bm_george", "speed": 0.8}' \
  --output british_slow.wav

# Health check
curl -s "$BASE/health" \
  -H "Ocp-Apim-Subscription-Key: $API_KEY" | python3 -m json.tool
```

## Use Cases

### Language Learning Platform

Generate practice audio at slower speeds for students learning English pronunciation:

```python
import requests
from pathlib import Path

API_KEY = "YOUR_KEY"
BASE_URL = "https://apim-ai-apis.azure-api.net/v1/tts"
HEADERS = {"Ocp-Apim-Subscription-Key": API_KEY}

LESSONS = {
    "greetings": [
        ("Hello, how are you?", "af_heart", 0.7),
        ("Good morning, nice to meet you.", "bm_george", 0.7),
        ("My name is Sarah. What is your name?", "af_sarah", 0.7),
    ],
    "directions": [
        ("Turn left at the next intersection.", "am_adam", 0.8),
        ("The restaurant is on the right side of the street.", "bf_emma", 0.8),
        ("Go straight ahead for two blocks.", "bm_lewis", 0.8),
    ],
}

# Create the output directory before writing any files
Path("lessons").mkdir(exist_ok=True)

for lesson_name, phrases in LESSONS.items():
    for i, (text, voice, speed) in enumerate(phrases):
        response = requests.post(
            f"{BASE_URL}/synthesize",
            headers=HEADERS,
            json={"text": text, "voice": voice, "speed": speed},
            timeout=30,
        )
        response.raise_for_status()
        filename = f"lessons/{lesson_name}_{i+1:02d}.wav"
        with open(filename, "wb") as f:
            f.write(response.content)
        print(f"  {filename}: {text}")
```

### IVR / Phone System

Generate consistent voice prompts for interactive voice response systems:

```python
import requests
from pathlib import Path

API_KEY = "YOUR_KEY"
TTS_URL = "https://apim-ai-apis.azure-api.net/v1/tts"
HEADERS = {"Ocp-Apim-Subscription-Key": API_KEY}

IVR_PROMPTS = {
    # Note: the voices are English-only, so the Spanish phrase below
    # may not be pronounced natively
    "welcome": "Thank you for calling. For English, press one. Para español, presione dos.",
    "main_menu": "For sales, press one. For support, press two. For billing, press three. To speak with an operator, press zero.",
    "hold": "Please hold. Your call is important to us. An agent will be with you shortly.",
    "voicemail": "We're sorry, no one is available to take your call. Please leave a message after the tone.",
    "hours": "Our office hours are Monday through Friday, 9 AM to 5 PM Eastern Time.",
    "goodbye": "Thank you for calling. Have a great day. Goodbye.",
}

# Create the output directory before writing any files
Path("ivr").mkdir(exist_ok=True)

for name, text in IVR_PROMPTS.items():
    response = requests.post(
        f"{TTS_URL}/synthesize",
        headers=HEADERS,
        json={"text": text, "voice": "af_nicole", "speed": 0.95},
        timeout=30,
    )
    response.raise_for_status()
    with open(f"ivr/{name}.wav", "wb") as f:
        f.write(response.content)
    print(f"  {name}.wav ({response.headers.get('X-Audio-Duration-Ms')}ms)")
```

### Accessibility / Screen Reader Alternative

Convert web page content to spoken audio for visually impaired users:

```python
import requests

API_KEY = "YOUR_KEY"
TTS_URL = "https://apim-ai-apis.azure-api.net/v1/tts"
HEADERS = {"Ocp-Apim-Subscription-Key": API_KEY}


def text_to_audio(content: str, voice: str = "am_michael") -> bytes:
    """Convert text content to audio for accessibility."""
    response = requests.post(
        f"{TTS_URL}/synthesize",
        headers=HEADERS,
        # Content beyond the 5000-character limit is truncated
        json={"text": content[:5000], "voice": voice, "speed": 0.9},
        timeout=30,
    )
    response.raise_for_status()
    return response.content


# Example: convert article sections to audio
article = {
    "title": "New Study Shows Benefits of Daily Exercise",
    "summary": "Researchers at Stanford University found that just 30 minutes of daily exercise can reduce the risk of heart disease by up to 40 percent.",
    "details": "The study tracked 10,000 participants over five years, measuring cardiovascular health markers quarterly.",
}

for section, text in article.items():
    audio = text_to_audio(text)
    with open(f"article_{section}.wav", "wb") as f:
        f.write(audio)
```

## Migration Guides

### Migrating from ElevenLabs

ElevenLabs uses a different endpoint structure.
Here is a side-by-side comparison:

```python
# BEFORE: ElevenLabs ($0.18/1K chars)
import requests

response = requests.post(
    "https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM",
    headers={
        "xi-api-key": "YOUR_ELEVEN_KEY",
        "Content-Type": "application/json",
    },
    json={
        "text": "Hello world",
        "model_id": "eleven_turbo_v2_5",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
    },
)
audio = response.content

# AFTER: Brainiall ($0.01-0.03/1K chars), 6-18x cheaper
response = requests.post(
    "https://apim-ai-apis.azure-api.net/v1/tts/synthesize",
    headers={"Ocp-Apim-Subscription-Key": "YOUR_KEY"},
    json={"text": "Hello world", "voice": "af_heart", "speed": 1.0},
)
audio = response.content
```

Key differences:

- Brainiall uses a single endpoint for all voices (voice is a parameter, not URL path)
- No separate model_id needed
- Speed control instead of stability/similarity_boost
- 6-18x lower cost per character
- 12 voices (English only) vs ElevenLabs' custom voice cloning

### Migrating from Google Cloud TTS

```python
# BEFORE: Google Cloud TTS ($0.004-0.016/1K chars)
from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()
input_text = texttospeech.SynthesisInput(text="Hello world")
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US",
    ssml_gender=texttospeech.SsmlVoiceGender.FEMALE,
)
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.LINEAR16,
    speaking_rate=1.0,
)
response = client.synthesize_speech(
    input=input_text, voice=voice, audio_config=audio_config
)
audio = response.audio_content

# AFTER: Brainiall ($0.01-0.03/1K chars)
import requests

response = requests.post(
    "https://apim-ai-apis.azure-api.net/v1/tts/synthesize",
    headers={"Ocp-Apim-Subscription-Key": "YOUR_KEY"},
    json={"text": "Hello world", "voice": "af_heart", "speed": 1.0},
)
audio = response.content
```

Key differences:

- No SDK installation or Google Cloud project setup required
- Simple REST API vs client library
- Direct WAV output (no audio encoding config)
- Pricing comparable to Google's Neural tier ($0.016/1K chars), with simpler billing

### Migrating from Amazon Polly

```python
# BEFORE: Amazon Polly ($0.004-0.016/1K chars)
import boto3

polly = boto3.client("polly", region_name="us-east-1")
response = polly.synthesize_speech(
    Text="Hello world",
    OutputFormat="pcm",
    VoiceId="Joanna",
    Engine="neural",
    SampleRate="16000",  # Polly's pcm output accepts only 8000 or 16000 Hz
)
audio = response["AudioStream"].read()

# AFTER: Brainiall ($0.01-0.03/1K chars)
import requests

response = requests.post(
    "https://apim-ai-apis.azure-api.net/v1/tts/synthesize",
    headers={"Ocp-Apim-Subscription-Key": "YOUR_KEY"},
    json={"text": "Hello world", "voice": "af_heart", "speed": 1.0},
)
audio = response.content
```

Key differences:

- No complex SDK or credentials setup needed
- Simple REST API with single API key authentication
- No region-specific endpoints

## MCP Server

### Configuration (Claude Desktop / Cursor / Cline)

```json
{
  "mcpServers": {
    "brainiall-speech": {
      "url": "https://apim-ai-apis.azure-api.net/mcp/pronunciation/mcp",
      "headers": {
        "Ocp-Apim-Subscription-Key": "YOUR_KEY",
        "Accept": "application/json, text/event-stream"
      }
    }
  }
}
```

TTS tools available via the Speech AI MCP server:

- `synthesize_speech`: Convert text to speech audio
- `list_tts_voices`: Get all available voices with metadata
- `check_tts_service`: Health check for TTS service

## Voice Comparison Guide

| Voice ID | Name | Gender | Accent | Best For |
|----------|------|--------|--------|----------|
| af_heart | Heart | Female | American | General purpose, warm tone |
| af_bella | Bella | Female | American | Conversational, friendly |
| af_nicole | Nicole | Female | American | Professional, clear |
| af_sarah | Sarah | Female | American | Narration, storytelling |
| af_sky | Sky | Female | American | Young, energetic |
| am_adam | Adam | Male | American | News, announcements |
| am_michael | Michael | Male | American | Professional, authoritative |
| bf_emma | Emma | Female | British | Education, tutorials |
| bf_isabella | Isabella | Female | British | Elegant, formal |
| bm_george | George | Male | British | Documentary, narration |
| bm_lewis | Lewis | Male | British | Casual, approachable |
| bm_daniel | Daniel | Male | British | Professional, clear |

## Pricing

| Tier | Price per 1K characters | Monthly cost (1M chars) |
|------|------------------------|------------------------|
| Standard | $0.01-0.03 | $10-30 |

Compare with competitors:

- ElevenLabs: $0.12-0.20 per 1K chars
- Google Cloud TTS: $0.004 per 1K chars (Standard), $0.016 (Neural)
- Amazon Polly: $0.004 per 1K chars (Standard), $0.016 (Neural)
- Azure Speech: $0.016 per 1K chars (Neural)
- OpenAI TTS: $0.015 per 1K chars

## Technical Details

- Engine: Kokoro speech synthesis
- Sample rate: 24kHz
- Bit depth: 16-bit PCM
- Channels: Mono
- Format: WAV (RIFF)
- Max text length: 5,000 characters per request
- Speed range: 0.25x to 4.0x
- Latency: Sub-1 second on GPU (NVIDIA A10)
- Languages: English (American and British accents)

## Links

- Website: https://brainiall.com
- Get API Key: https://brainiall.com
- Speech AI Examples: https://github.com/fasuizu-br/speech-ai-examples
- NLP API: https://github.com/fasuizu-br/brainiall-nlp-api
- Image API: https://github.com/fasuizu-br/brainiall-image-api
- LLM Gateway: https://github.com/fasuizu-br/brainiall-llm-gateway
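
## Appendix: Checking Returned Audio Locally

The Technical Details section lists the output format as 24kHz, 16-bit PCM, mono WAV, and the audiobook example produces several part files per chapter. The sketch below is one possible way to check a response against that format and to join parts into a single file, using only Python's standard `wave` module. The helper names (`inspect_wav`, `matches_documented_format`, `concat_wavs`, `make_silence`) are illustrative assumptions and not part of the Brainiall API; `make_silence` simply stands in for `response.content` so the example runs offline.

```python
import io
import wave


def inspect_wav(data: bytes) -> dict:
    """Read the WAV header from raw bytes and return its basic properties."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "bit_depth": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_ms": round(frames * 1000 / rate),
        }


def matches_documented_format(info: dict) -> bool:
    """Compare against the documented format: 24kHz, 16-bit, mono."""
    return (
        info["sample_rate"] == 24000
        and info["bit_depth"] == 16
        and info["channels"] == 1
    )


def concat_wavs(parts: list[bytes]) -> bytes:
    """Join WAV segments into one WAV, assuming they all share one format."""
    if not parts:
        raise ValueError("need at least one WAV segment")
    out = io.BytesIO()
    with wave.open(out, "wb") as writer:
        for index, part in enumerate(parts):
            with wave.open(io.BytesIO(part), "rb") as reader:
                if index == 0:
                    # Copy channels, sample width, and rate from the first part
                    writer.setparams(reader.getparams())
                writer.writeframes(reader.readframes(reader.getnframes()))
    return out.getvalue()


def make_silence(duration_ms: int, sample_rate: int = 24000) -> bytes:
    """Build a silent 16-bit mono WAV in memory (hypothetical stand-in for an API response)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * (sample_rate * duration_ms // 1000))
    return buffer.getvalue()


if __name__ == "__main__":
    part_one = make_silence(500)
    part_two = make_silence(250)
    print(inspect_wav(part_one))
    print(matches_documented_format(inspect_wav(part_one)))
    print(inspect_wav(concat_wavs([part_one, part_two]))["duration_ms"])
```

With a real response, `inspect_wav(response.content)["duration_ms"]` could be compared against the `X-Audio-Duration-Ms` header as a sanity check; whether the two agree exactly is not stated in this documentation, so treat small differences as expected until verified.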