---
title: Voice coding is the new live coding
date: "2025-09-21T11:18:17Z"
lastmod: "2025-09-21T11:20:30Z"
categories:
  - coding
  - llms
wp_id: 4194
description: "A voice-to-code CLI workflow makes LLM-assisted live coding faster, more engaging, and smoother for audiences than traditional typing-heavy demos."
keywords: ["voice coding", "live coding", "CLI", "Gemini", "transcription", "developer workflows"]
---

![Voice coding is the new live coding](/blog/assets/ChatGPT-Image-Sep-21-2025-04_46_27-PM.webp)

In Feb 2025 at PyConf Hyderabad, I tried a new slide format: [command-line slideshows in `bash`](/blog/command-line-slideshows-in-bash/).

I've used this format in more talks since then:

- [LLMs in the CLI](https://github.com/sanand0/talks/blob/main/2025-06-pycon-sg/llm-cli.md), PyCon Singapore, Jun 2025
- [Agents in the CLI](https://github.com/sanand0/talks/blob/main/2025-07-24-pugs-agent-loop/README.md), Singapore Python User Group, Jul 2025
- [DuckDB is the new Pandas](https://github.com/sanand0/talks/blob/main/2025-09-13-duckdb-is-the-new-pandas/README.md), PyCon India, Sep 2025

It's my favorite format. I can demo code without breaking the presentation flow.\
It also draws interest. My setup was the [top question in my PyCon talk](https://github.com/sanand0/talks/blob/main/2025-09-13-duckdb-is-the-new-pandas/README.md#qa).

In Sep 2025, at PyCon India, I extended the setup for voice typing. [`talkcode.sh`](https://github.com/sanand0/scripts/blob/54560718bf2f4148d9005d74ab1543de52cff6d9/talkcode.sh) is a Bash pipeline that:

- uses [`ffmpeg`](https://ffmpeg.org/) to record the mic as 16 kHz mono, voice-optimized `.opus`
- sends audio to Gemini via [`llm`](https://llm.datasette.io/) for transcription
- uses [`awk`](https://en.wikipedia.org/wiki/AWK) to extract the code fence
- uses [`xclip`](https://github.com/astrand/xclip) to copy the code to the clipboard
- uses [`xdotool`](https://github.com/jordansissel/xdotool) to paste it back to the original window

````bash
# Excerpt: $AUDIO, $SYSTEM_TEXT and $ACTIVE_WIN are set elsewhere in the full script
# Mic only, 16 kHz mono, voice filtering, fast Opus
ffmpeg -hide_banner -v error \
  -f pulse -i default \
  -ac 1 -ar 16000 \
  -af "highpass=f=100,lowpass=f=6000" \
  -c:a libopus -b:a 16k -vbr on \
  -compression_level 2 -application voip -frame_duration 60 \
  -y "$AUDIO"

# Transcribe, extract the code fence and copy to clipboard
llm -m gemini-2.5-flash -a "$AUDIO" -s "$SYSTEM_TEXT" \
  | tee /dev/tty \
  | awk 'BEGIN{f=0} /```/{f=!f; next} f{buf=buf$0"\n"} END{print buf}' \
  | xclip -selection clipboard

# Bring last window to foreground and paste from clipboard
if [ -n "${ACTIVE_WIN:-}" ]; then
  xdotool windowactivate --sync "$ACTIVE_WIN"
  sleep 0.08
  xdotool key --clearmodifiers ctrl+shift+v
fi
````
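To see what the `awk` step does in isolation, here is a minimal sketch that runs the same filter on a canned response. The sample text and the `extract_fence` helper name are made up for illustration; in the real pipeline the input comes from `llm`.

````bash
#!/usr/bin/env bash
# Hypothetical stand-in for a model response: prose wrapped around one code fence.
response='Here is the query:
```sql
SELECT 42 AS answer;
```
Hope that helps.'

# Same awk program as in the pipeline: toggle f on every fence line,
# skip the fence line itself, and buffer only the lines seen while f is on.
extract_fence() {
  awk 'BEGIN{f=0} /```/{f=!f; next} f{buf=buf$0"\n"} END{print buf}'
}

# Command substitution trims the trailing newlines that the awk step leaves behind.
extracted=$(printf '%s\n' "$response" | extract_fence)
printf '%s\n' "$extracted"   # prints: SELECT 42 AS answer;
````

Everything outside the fence is dropped, so any commentary from the model never reaches the clipboard.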

During the workshop, I said, "Which (judicial) bench has the longest pending cases?" It generated a DuckDB query that ran correctly on the first try against the [Indian High Court judgements](https://github.com/vanga/indian-high-court-judgments) dataset.

---

But LLMs are slow and break the flow. Here's how I keep the room engaged:

1. **Dictate, don't type.** Speaking is faster **and** more engaging. That's why I built this workflow.
2. **Avoid Alt-Tab.** Bring the LLM **into** your app. Window-switching and copy-paste break focus for you **and** the audience.
3. **Always stream output.** Narrate as it loads. `| tee /dev/tty` prints the response to the terminal as it arrives while still piping it to the next command.
4. **Answer questions while you wait.** Keep a [Slido](https://www.slido.com/) Q&A open and address the top ones while waiting for the LLM.
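The streaming tip can be tried without an LLM. This is a rough sketch, not part of `talkcode.sh`: `slow_llm` is a hypothetical stand-in that emits one line at a time, and the fallback to `/dev/stderr` is an added assumption for environments with no controlling terminal.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a slow, streaming model call.
slow_llm() {
  for word in SELECT 1 AS demo; do
    printf '%s\n' "$word"
    sleep 0.2
  done
}

# Show the stream on the terminal if one can be opened, else on stderr.
if { : > /dev/tty; } 2>/dev/null; then SCREEN=/dev/tty; else SCREEN=/dev/stderr; fi

# tee copies each line to the screen as it arrives;
# the rest of the pipeline still receives the full output.
result=$(slow_llm | tee "$SCREEN" | paste -sd' ' -)
printf 'captured: %s\n' "$result"   # prints: captured: SELECT 1 AS demo
```

The audience sees words appear one by one, while the captured result only becomes available once the stream ends.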

[LinkedIn](https://www.linkedin.com/posts/sanand0_%F0%9D%97%A9%F0%9D%97%BC%F0%9D%97%B6%F0%9D%97%B0%F0%9D%97%B2-%F0%9D%97%B0%F0%9D%97%BC%F0%9D%97%B1%F0%9D%97%B6%F0%9D%97%BB%F0%9D%97%B4-%F0%9D%97%B6%F0%9D%98%80-%F0%9D%98%81%F0%9D%97%B5%F0%9D%97%B2-%F0%9D%97%BB%F0%9D%97%B2%F0%9D%98%84-activity-7376824026595278848-PxgV)