---
title: Things I Learned - 21 Jan 2024
date: 2024-01-21T00:00:00+00:00
categories:
  - til
description: I compared quantized Mistral outputs, tuned ElevenLabs voice cloning, and explored Tim Ferriss’s writing constraints. I also learned about Lilac for data curation and why lungs have a high Hausdorff dimension of 2.97.
keywords: [mistral, quantization, elevenlabs, lilac ml, hausdorff dimension, tim ferriss, writing, speech synthesis]
---

This week, I learned:

- When comparing Mistral with 4-bit quantization vs unquantized:
  - 2 responses were significantly shorter and fairly different
  - 1 was identical
  - 1 was almost identical but shorter by a few words
  - 1 was slightly longer and fairly different
- #PREDICTION As humans have more conversations with LLMs, they will replace video watching and interactive gaming with conversation-based role play. New game genres will evolve.
- [Lilac](https://www.lilacml.com/) is an LLM-based data curation tool. Use it to search by concept (e.g. PII, duplicates, etc.) and then drop/update the results.
- Lungs have a Hausdorff dimension of 2.97, giving them one of the highest surface-area-to-volume ratios. Brains are 2.8. The Sierpinski pyramid is exactly 2, which is weird. To solid-paint one twice the size, you need 4 times as much paint.
- [How I write podcast. Tim Ferriss](https://youtu.be/rXUuStdMeoE)
  - High bars are constraints. I set the strongest constraints against the scarcest resources, like reputation.
  - Being a category of one is more defensible than a competitive advantage
  - Content always beats presentation. When in doubt, push for more interesting content
  - Regular publishing improves thinking
  - To build a habit, do less than you think you can do. That makes it easier to build momentum on the habit and sustain it during crunch times
  - There is a lot of mediocrity in the world. If you're doing something (in a winner-take-all ecosystem), be the best.
  - Top lawyers are exceptional proofreaders. They can very quickly see what is unclear, what is redundant, and what has loopholes.
  - Forcing yourself to cut down from a thousand words to 200 to a paragraph to a sentence takes you through a phase transition where you discover something unexpected
  - The more outrageous the question, the more likely it is to be useful in generating a new perspective
- [ElevenLabs speech synthesis](https://elevenlabs.io/speech-synthesis) with voice cloning is at the uncanny valley. With two 5-minute samples, the clone sounds a fair bit like my voice but is very clearly not my voice. I find stability ~30%, similarity ~80% and style ~50% give a reasonable outcome. But the default voices (e.g. Joseph, George, Charlie) are excellent.
- Practical AI podcast: AI predictions for
  - AI by API is the norm today and will grow
  - Just having AI is no longer a differentiator
  - AI is part of life, not just work
  - #TODO Explore quickdrop from Stability for Maruti
  - #TODO Explore Codium VS Code plugin and Continue.dev
  - Hybrid systems that combine stats, ML, DL and AI models will grow
  - AGI and AutoGPT resurgence
  - RAG will continue to be a focus
  - GPT-4 will be beaten by open-source models. Special-purpose models beat it already
  - Self-hosted and cloud-hosted models will grow for security
  - Small language models will grow
  - Productivity will be enhanced rather than replaced
  - Multimodal models will grow
  - Cost efficiency will grow in focus
- [GPT Builder help](https://help.openai.com/en/articles/8770868-gpt-builder) explains how the GPT Builder updates GPTs, including some very interesting prompts
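
The Sierpinski pyramid note above can be checked with the similarity-dimension formula D = log(N) / log(s), where a shape scaled up by a factor s is made of N copies of the original. This is a minimal sketch, not a general Hausdorff-dimension calculator: the function names `similarity_dimension` and `paint_factor` are my own, the cube is added only for comparison, and the formula applies only to exactly self-similar shapes. The lung and brain figures are empirical estimates and cannot be derived this way.

```python
import math


def similarity_dimension(copies: int, scale: float) -> float:
    """Similarity dimension D = log(copies) / log(scale).

    For exactly self-similar shapes such as the Sierpinski pyramid,
    this matches the Hausdorff dimension.
    """
    return math.log(copies) / math.log(scale)


def paint_factor(dimension: float, scale: float) -> float:
    """How much more material a shape of this dimension needs when scaled up."""
    return scale ** dimension


# Sierpinski pyramid: doubling the edge length takes 4 copies of the original.
sierpinski = similarity_dimension(copies=4, scale=2)
# Solid cube, for comparison: doubling the edge length takes 8 copies.
cube = similarity_dimension(copies=8, scale=2)

print(f"Sierpinski pyramid: D = {sierpinski:.2f}, paint x{paint_factor(sierpinski, 2):.1f}")
print(f"Solid cube:         D = {cube:.2f}, paint x{paint_factor(cube, 2):.1f}")
```

A solid object needs 8 times the material when doubled, so the pyramid's factor of 4 is what makes a 3D-looking shape behave like a 2D surface.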
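
The ElevenLabs settings above (stability ~30%, similarity ~80%, style ~50%) could be reproduced outside the web UI roughly like this. This is a sketch under assumptions: to my understanding the text-to-speech API takes these as 0.0 to 1.0 floats in a `voice_settings` object with the keys `stability`, `similarity_boost` and `style`, sent to `/v1/text-to-speech/{voice_id}` with an `xi-api-key` header, so check the current API reference before relying on it. The helper names, the sample text and the voice ID placeholder are hypothetical, and the code only builds the request without sending it.

```python
import json
import urllib.request

# Assumed endpoint shape; verify against the current ElevenLabs API reference.
API_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def percent(value: float) -> float:
    """Convert a UI percentage (0-100) to the 0.0-1.0 range."""
    if not 0 <= value <= 100:
        raise ValueError(f"percentage out of range: {value}")
    return value / 100


def build_payload(text: str, stability_pct: float = 30,
                  similarity_pct: float = 80, style_pct: float = 50) -> dict:
    """Build the JSON body, defaulting to the settings from the note above."""
    return {
        "text": text,
        "voice_settings": {
            "stability": percent(stability_pct),
            "similarity_boost": percent(similarity_pct),
            "style": percent(style_pct),
        },
    }


def build_request(voice_id: str, api_key: str, text: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for a cloned voice."""
    body = json.dumps(build_payload(text)).encode("utf-8")
    return urllib.request.Request(
        API_URL.format(voice_id=voice_id),
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


payload = build_payload("This is a test of my cloned voice.")
print(json.dumps(payload, indent=2))
```

To actually synthesize audio, pass the result of `build_request("<your-voice-id>", "<your-api-key>", text)` to `urllib.request.urlopen` and write the response bytes to a file.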