---
title: Python BYO LLM with Anam TTS and avatar
description: "Bring your own LLM to Anam with a Python backend — connect any language model while using Anam's TTS and avatar."
tags: [python, custom-llm, tts]
date: 2026-02-24
authors: [sebvanleuven]
---

When you run your own LLM, you need Anam to handle only TTS and avatar, not the full pipeline. Set `llm_id="CUSTOMER_CLIENT_V1"` to disable Anam's LLM in the orchestration layer. You send your LLM's output via `talk_stream.send()`, and Anam converts it to speech and renders the avatar.

This recipe focuses on **PersonaConfig** with `llm_id="CUSTOMER_CLIENT_V1"`, sending example LLM output via `create_talk_stream()` and `talk_stream.send()`, and interruption handling with the `TALK_STREAM_INTERRUPTED` callback. The complete code is at [examples/python-byo-llm](https://github.com/anam-org/anam-cookbook/tree/main/examples/python-byo-llm).

## What you'll build

A Python script that:

- Uses **PersonaConfig** with `llm_id="CUSTOMER_CLIENT_V1"` (disables Anam's LLM)
- Connects with `connect_async()` and disables session recordings
- Sends your LLM's output via `create_talk_stream()` and `talk_stream.send()` on the `TalkMessageStream`
- Handles interruptions with the `TALK_STREAM_INTERRUPTED` callback
- Displays the avatar and plays audio

The script reads LLM output from a file, one text chunk per line, and adds a 450 ms delay between chunks to simulate real-time LLM streaming.

## Prerequisites

- Python 3.10+
- [uv](https://docs.astral.sh/uv/)
- Anam API key from [lab.anam.ai](https://lab.anam.ai)
- Avatar and voice IDs from [lab.anam.ai](https://lab.anam.ai)

## Disabling Anam's LLM: CUSTOMER_CLIENT_V1

To use your own LLM, you must disable Anam's built-in LLM. Set `llm_id="CUSTOMER_CLIENT_V1"` in PersonaConfig. This tells Anam's orchestration layer that the customer provides the LLM, so Anam will not run its own. You send your LLM's output via `talk_stream.send()`, and it goes directly to TTS.
```python
from anam.types import PersonaConfig

persona_config = PersonaConfig(
    avatar_id="your-avatar-id",
    voice_id="your-voice-id",
    llm_id="CUSTOMER_CLIENT_V1",  # Required: disables Anam's LLM
    enable_audio_passthrough=False,
)
```

**Why this is required:** Without `CUSTOMER_CLIENT_V1`, Anam runs its own LLM alongside yours, producing a second stream of LLM output and additional TTS segments. These conflict with your LLM's output and the conversation context, resulting in a poor user experience.

## Connecting with connect_async and disabling session recordings

Use `connect_async()` instead of `connect()` when you need to pass session options. Set `enable_session_replay=False` to disable session recordings.

```python
from anam import AnamClient, AnamEvent, ClientOptions
from anam.types import SessionOptions

client = AnamClient(
    api_key=api_key,
    persona_config=persona_config,
    options=ClientOptions(),
)

session_options = SessionOptions(enable_session_replay=False)
session = await client.connect_async(session_options=session_options)
try:
    ...  # use the session
finally:
    await session.close()
```

## Sending your LLM's output

Wait for `SESSION_READY` before sending chunks. Create a `TalkMessageStream` with `create_talk_stream()` and send text chunks with `talk_stream.send()`. The stream manages correlation IDs internally for interruption handling. Set `end_of_speech=True` on the final chunk:

```python
talk_stream = session.create_talk_stream()
for i, text in enumerate(chunks):
    await talk_stream.send(text, end_of_speech=(i == len(chunks) - 1))
```

If your LLM streams chunks without a clear "last chunk" signal (e.g. when consuming async iterators), call `talk_stream.end()` when done to signal end of speech.

Register a `TALK_STREAM_INTERRUPTED` callback to handle interruption events. When the user interrupts, flush any remaining text in the buffer/response and create a new `TalkMessageStream`.
A new `TalkMessageStream` is required to create a new `correlation_id`, so that the new LLM output is mapped to the new turn.

```python
@client.on(AnamEvent.TALK_STREAM_INTERRUPTED)
async def on_talk_stream_interrupted(correlation_id: str | None) -> None:
    print(f"Application level talk stream interruption handling for: {correlation_id}")
    global talk_stream
    # Flush the LLM output buffer to avoid stale output being sent
    llm_output_buffer.clear()
    # Create a new talk stream for the new turn
    talk_stream = session.create_talk_stream()
    follow_up = "Okay, interrupted. What else can I help you with today?"
    await talk_stream.send(follow_up, end_of_speech=True)
```

For a single message, you can use `session.send_talk_stream(content)` as a convenience method: it creates a stream, sends, and ends in one call. However, it is discouraged for streaming LLM output because of the overhead and the complexity around interrupt handling.

## Project setup

```bash
git clone https://github.com/anam-org/anam-cookbook.git
cd anam-cookbook/examples/python-byo-llm
uv sync
cp .env.example .env
```

Edit `.env`:

```bash
ANAM_API_KEY=your_key
ANAM_AVATAR_ID=your_avatar_id
ANAM_VOICE_ID=your_voice_id
```

## Running the script

```bash
uv run python main.py                     # uses llm_output_sample.txt
uv run python main.py path/to/chunks.txt  # custom file (one text chunk per line)
```

Press `q` in the video window to quit, or `i` to interrupt the avatar.

## Terminology

- **Avatar** – Just the visual character
- **TTS** – Text-to-speech engine
- **LLM** – Language model

With `CUSTOMER_CLIENT_V1`, you provide the LLM; Anam provides TTS and avatar: a single pipeline from your text to lip-synced video.
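As a closing aside, the file-driven simulation the example script uses (one chunk per line, a 450 ms pause between chunks) can be sketched in plain Python without the Anam SDK. The `send` callback below stands in for `talk_stream.send()`; the function and parameter names here are illustrative, not part of any SDK:

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable

CHUNK_DELAY_S = 0.45  # 450 ms between chunks, as in the recipe


async def stream_chunks_from_file(
    path: str,
    send: Callable[[str, bool], Awaitable[None]],
    delay_s: float = CHUNK_DELAY_S,
) -> None:
    """Read one text chunk per line and forward each to `send`,
    flagging the final chunk as the end of speech."""
    lines = Path(path).read_text().splitlines()
    chunks = [line.strip() for line in lines if line.strip()]
    for i, text in enumerate(chunks):
        # Second argument mirrors end_of_speech: True only on the last chunk
        await send(text, i == len(chunks) - 1)
        if i < len(chunks) - 1:
            await asyncio.sleep(delay_s)  # simulate real-time LLM streaming
```

In the real script, `send` would be a small wrapper that calls `talk_stream.send(text, end_of_speech=...)`, which keeps the file-reading logic independent of the session object.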