---
title: Expressive Voice Agents with ElevenLabs and Anam Avatars (Client-Side)
description: Build a Next.js app where ElevenLabs handles voice intelligence and Anam renders a real-time lip-synced avatar, with streaming transcripts and interruption support.
tags: [agents, javascript, nextjs, elevenlabs]
difficulty: 'intermediate'
sdk: 'javascript'
date: 2026-02-17
authors: [robbie-anam]
---

## Overview

**A server-side integration is now available.** Unless you have a specific need for client-side audio bridging (e.g. client tools), we strongly recommend the [server-side ElevenLabs agents recipe](/cookbook/elevenlabs-server-side-agents) instead. It's simpler, lower latency, and provides better session monitoring.

ElevenLabs' Conversational AI handles voice intelligence — speech recognition, LLM reasoning, and text-to-speech with expressive intonation. Anam renders a real-time lip-synced avatar from that audio. This cookbook shows how to bridge the two SDKs in a Next.js app so users talk to a face, not a loading spinner.

The full source code is available on [GitHub](https://github.com/robbie-anam/elevenlabs-agent/tree/clientside_version).

## What You'll Build

- A Next.js app where users speak into their mic and get a face-to-face response from an AI agent
- ElevenLabs handles the full voice pipeline (STT → LLM → TTS) with expressive V3 voices
- Anam renders a real-time lip-synced avatar from the generated audio
- A streaming transcript that reveals text character-by-character in sync with the avatar's mouth
- Interruption support — speak while the agent is talking and the avatar stops mid-sentence
- Multiple persona presets — switch between different avatar + agent combinations

## How the Two SDKs Work Together

The ElevenLabs SDK captures microphone audio and sends it over a WebSocket. ElevenLabs' cloud runs speech-to-text, passes the transcript to an LLM, and streams synthesized speech back as base64 PCM chunks.
Those chunks are forwarded to Anam's `sendAudioChunk()` method, which generates a lip-synced face video delivered over WebRTC.

```
User speaks
     ↓
ElevenLabs SDK (mic capture)
     ↓
WebSocket → ElevenLabs Cloud
     ↓
STT → LLM → TTS
base64 PCM chunks
     ↓
onAudio callback
     ↓
sendAudioChunk()
Anam WebRTC
     ↓
Lip-synced avatar video
```
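The bridging step in that diagram — the `onAudio` callback handing each base64 PCM chunk to `sendAudioChunk()` — can be sketched as a small helper. This is a sketch under assumptions: the `makeAudioBridge` and `base64ToUint8Array` helpers are names introduced here for illustration, and whether Anam's `sendAudioChunk()` expects raw bytes or the base64 string itself should be confirmed against the Anam SDK docs.

```javascript
// Decode a base64-encoded PCM chunk (as streamed by ElevenLabs)
// into raw bytes. `atob` is available in browsers and Node 16+.
function base64ToUint8Array(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Build an onAudio-style callback that forwards each decoded chunk
// to the Anam client. Only sendAudioChunk() is named in this recipe;
// the rest of the anamClient shape is assumed.
function makeAudioBridge(anamClient) {
  return (base64Chunk) => {
    anamClient.sendAudioChunk(base64ToUint8Array(base64Chunk));
  };
}
```

You would pass the returned function as the audio callback when starting the ElevenLabs session, so every synthesized chunk reaches the avatar as soon as it arrives.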