---
date: "2025-12-25T02:54:41Z"
categories:
  - llms
  - linkedin
description: I discovered Gemini is practically music-deaf, showing zero correlation with human emotional ratings across 40 songs. It over-predicts joy and ignores tension, suggesting it guesses based on text transcriptions rather than actually hearing the audio.
keywords: [gemini, music emotion, audio analysis, multimodal llm, affective computing, emotion recognition]
---

Gemini can pass the bar exam and solve maths olympiad puzzles. But it's music-deaf.

Nitin Kumar asked Gemini to rate 40 songs on joy, sadness, tension, nostalgia, etc. and compared the results with human ratings. There was **ZERO** correlation between the two. It's as if it's a different species.

In fact, if you just predicted the average emotion for every single song, you'd still do 1.2× to 2× better than Gemini! It wasn't adding noise to a signal. It was **subtracting signal from noise**! For one song, the correlation was -88%, i.e. it predicted the exact opposite emotions.

It's not just noisy, it's biased as well. It hears music as happier, more tender, and more powerful than we do. It massively over-predicts "joyful activation" (53% of songs!) and dramatically under-estimates tension.

The emotion predictions are suspiciously correlated with each other. Power and joy are basically identical (96% correlation).

This confirms a suspicion I had: Gemini can't actually hear the audio. It can transcribe, but beyond that it's just guessing.

Human music taggers are safe, for now.

Story, prompts & analysis: https://sanand0.github.io/datastories/llm-music/

![](https://files.s-anand.net/images/2025-12-25-can-ai-hear-what-we-feel-linkedin.jpg)

[LinkedIn](https://www.linkedin.com/posts/sanand0_gemini-can-pass-the-bar-exam-and-solve-maths-activity-7409634892520546305-NdNi)
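The two comparisons the post relies on (per-song correlation with human ratings, and a mean-prediction baseline beating the model on error) can be sketched like this. This is a minimal illustration with made-up ratings, not the study's actual data; the emotion list, scale, and numbers are all assumptions:

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation between two equal-length rating vectors."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def mae(a, b):
    """Mean absolute error between two rating vectors."""
    return statistics.mean(abs(x - y) for x, y in zip(a, b))

# Hypothetical ratings for one song on a 1-5 scale, in the order
# [joy, sadness, tension, nostalgia, power] (emotions assumed).
human  = [2.0, 4.0, 4.5, 3.0, 2.5]
gemini = [4.5, 2.0, 1.5, 3.5, 4.0]  # over-predicts joy, misses tension

print(pearson(human, gemini))  # strongly negative on this toy example

# Mean baseline: predict the same average value for every emotion.
# The post's "1.2x to 2x better" claim is this kind of comparison:
# the baseline's error beats the model's (here, on invented numbers).
baseline = [statistics.mean(human)] * len(human)
print(mae(human, gemini), mae(human, baseline))
```

A real evaluation would repeat this over all 40 songs and average, but the shape of the comparison is the same.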