---
title: Things I Learned - 29 Dec 2024
date: 2024-12-29T00:00:00+00:00
categories:
  - til
description: I explored using LLMs for educational games and market research, compared log storage costs across cloud providers, and tested document conversion tools. I also found that NumPy often outperforms HNSW indexing for similarity searches on datasets under 1M vectors.
keywords: [llm, cloudflare, numpy, duckdb, vector search, software agents, prompt engineering, document conversion]
---

This week, I learned:

- A clever idea. Give an LLM a chapter from a textbook. Ask it to generate a unique, playable game to help me learn the concepts for an exam. [Page Bailey](https://www.linkedin.com/feed/update/urn:li:activity:7278124663048695809/)
- What would be the cost of storing about 500GB of LLM cache logs and 5 million write requests per month? (Storage + writes.)
  - Cloudflare KV: $250 + $25 / month [Ref](https://developers.cloudflare.com/kv/platform/pricing/)
  - MongoDB: $125 + $5 / month [Ref](https://www.mongodb.com/pricing)
  - S3: $11.50 + $25 / month [Ref](https://aws.amazon.com/s3/pricing/) + ?
  - Cloudflare R2: $7.50 + $22.50 / month [Ref](https://developers.cloudflare.com/r2/pricing/)
- Satya Nadella prepares for meetings by asking Copilot to tell him everything he needs to know about the client from the CRM, emails, meeting transcripts, etc. He shares it with colleagues, who annotate it further for him. That's using AI for reasoning _and_ collaborating with colleagues. [Satya Nadella | BG2 w/ Bill Gurley & Brad Gerstner](https://youtu.be/9NtsnzRFJ_o?si=0oynYlHPb90TaACD&t=3254)
- WOW. This is how a software agent will work alongside humans: [Fix issue #5478: Add color to the line next to "Ran a XXX Command" based on return value](https://github.com/All-Hands-AI/OpenHands/pull/5483) - using [@openhands-agent](https://github.com/openhands-agent).
- [aisuite](https://github.com/andrewyng/aisuite) by Andrew Ng is a unified interface to LLMs.
  Sort of like an `openai` library across multiple providers.
- Learnings from [Best of 2024 in Agents (from #1 on SWE-Bench Full, Prof. Graham Neubig of OpenHands/AllHands)](https://youtu.be/B6PKVZq2qqo)
  - Passing code execution as a tool is more powerful than granular tools. You combine multiple tools and tool calls into one. You move code to the data rather than the other way around. Mostly, you need bash, Python (or Jupyter), a file manager, and a web browser.
  - UI: Go where the user is, instead of bringing them to you.
  - A remote runtime is a critical component.
- Claude 3.5 Sonnet (20241022) and Claude 3.5 Haiku (20241022) perform best on SWE-Bench, followed by DeepSeek V3, then o1 (2024-12-17). [X](https://x.com/xingyaow_/status/1872145835699691675)
- Browsers support SVG favicons as data URLs. So I used this SVG (generated by Claude via `Generate a simple, interesting SVG favicon. Keep the SVG size VERY small but it should be inspiring.`)
- Since building an HNSW index adds overhead, just use NumPy matrix multiplication to calculate cosine similarity. For 1M vectors, it takes ~0.05 seconds. 1M vectors cover ~2GB of text at a chunk size of 2K chars (1M × 2K chars ≈ 2GB). In short, if you're embedding <2GB of text, just use NumPy.
- DuckDB's VSS extension HNSW index + embeddings (2K chunks of 512 dimensions) take up roughly 2.5X the size of the original data. Embedding 554 files of ~4,456 KB took 710 seconds. Creating the index took 660 seconds. The resulting DB was 18.1 MB.
- How to use [LLMs in market research](https://businessmeetsai.substack.com/p/market-research-meets-ai-the-3-step).
  - Use LLMs with search for secondary research.
  - Create different personas and run user surveys on them. [This paper used 1,052 real-life interview audio transcripts as agent memory to simulate people](https://x.com/emollick/status/1858664562750374139)
  - Generate your market research report using LLMs.
- Given about 30 generations, Llama 1b outperforms Llama 8b.
  [Ref](https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute)
- OpenAI introduced a `developer` role in addition to the `system` role. This is mainly for `o1`. The API is backward compatible - and also forward compatible. [OpenAI](https://community.openai.com/t/how-is-developer-message-better-than-system-prompt/1062784)
- Em dashes are a strong sign of ChatGPT use. Curly quotes too. [Reddit](https://www.reddit.com/r/ApplyingToCollege/comments/1h0vhlq/in_the_past_three_days_ive_reviewed_over_100/)
- Cloudflare has multiple [SSL modes](https://developers.cloudflare.com/ssl/origin-configuration/ssl-modes/) when proxying requests.
  - [Off (no encryption)](https://developers.cloudflare.com/ssl/origin-configuration/ssl-modes/off/): No encryption between browsers and Cloudflare or between Cloudflare and origins. Everything is cleartext HTTP.
  - [Flexible](https://developers.cloudflare.com/ssl/origin-configuration/ssl-modes/flexible/): Browser to Cloudflare is HTTPS, Cloudflare to origin is HTTP. Useful for setting up Cloudflare as an HTTP proxy.
  - [Full](https://developers.cloudflare.com/ssl/origin-configuration/ssl-modes/full/): Browser to Cloudflare uses whatever protocol the browser requested. Cloudflare uses the same protocol to the origin, without validating the origin's certificate. Use for self-signed or otherwise invalid certificates.
  - [Full (strict)](https://developers.cloudflare.com/ssl/origin-configuration/ssl-modes/full-strict/): Similar to Full mode, but with certificate validation.
  - [Strict (SSL-Only Origin Pull)](https://developers.cloudflare.com/ssl/origin-configuration/ssl-modes/ssl-only-origin-pull/): Cloudflare always connects to the origin over HTTPS with certificate validation.
  - Getting this wrong can lead to an [HTTP 526: invalid SSL certificate](https://developers.cloudflare.com/support/troubleshooting/cloudflare-errors/troubleshooting-cloudflare-5xx-errors/#error-526-invalid-ssl-certificate)
- Medical coding is an area ripe for LLMs.
  - [Ojasvi Yadav created a repo](https://github.com/ojasviyadav/medical-coding-agent/) that uses hierarchical classification (rather than embeddings) to find the right coding.
  - Gemini models seem to understand medical terms better than others.
  - RapidClaims, funded by TogetherAI, is apparently working on this problem.
- Document to Markdown Converters:
  - [PyMuPDF4LLM](https://pymupdf.readthedocs.io/en/latest/pymupdf4llm/) uses [MuPDF](https://mupdf.com/). Requires PyTorch.
    - `PYTHONUTF8=1 uv run --with pymupdf4llm python -c 'import sys, pymupdf4llm; open("pymupdf4llm.md", "w", encoding="utf-8").write(pymupdf4llm.to_markdown(sys.argv[1]))' "$FILE.pdf"`
  - [markitdown](https://github.com/microsoft/markitdown) from Microsoft. PDF via PDFMiner, DOCX via Mammoth, XLSX via Pandas, PPTX via python-pptx, ZIP, etc.
    - `PYTHONUTF8=1 uvx markitdown "$FILE.pdf" > markitdown.md`
  - [Docling](https://github.com/DS4SD/docling) by IBM. I was unable to install it via pip on Windows or on Linux.
  - [MegaParse](https://github.com/QuivrHQ/MegaParse) uses libreoffice, pandoc, tesseract-ocr, etc. Requires an OpenAI API key.
  - [Awesome Tabular LLMs](https://github.com/SpursGoZmy/Awesome-Tabular-LLMs) compiles encodings of tables for LLMs.
  - [What's the best way of encoding tabular data for LLMs?](https://typeset.io/search?q=What%27s%20the%20best%20way%20of%20encoding%20tabular%20data%20for%20LLMs%3F) Looks like including the cell address helps. [Here is an explanation from ChatGPT](https://chatgpt.com/share/6768c852-3bd4-800c-a4c7-0e4692a49afd)
  - [aspose-words](https://pypi.org/project/aspose-words/) is a Python library that converts documents between many formats (Word, RTF, PDF, HTML, Markdown, EPUB, etc.)
- Discourse does not support searching across multiple forums. Instead, search for the term in all forums. [Example](https://discourse.onlinedegree.iitm.ac.in/search?q=TDS). Then scroll through the results. Then, in the console, hide the ones you don't want.
  Example:
  - Hide posts that are not in the "Tools in Data Science" category (`$$` is the DevTools console shorthand for `document.querySelectorAll`): `$$(".badge-category__name").filter(d => d.textContent.trim() !== "Tools in Data Science").map(d => d.closest(".fps-result")).filter(d => d).forEach(d => d.style.display = "none")`
- [How are software engineers future-proofing their careers in the face of LLMs?](https://news.ycombinator.com/item?id=42431103)
  - Leveraging LLMs as Force Multipliers
    - Use LLMs for repetitive tasks, rapid prototyping, exploring multiple approaches, data extraction, brainstorming, and feedback.
    - Explore prompting techniques, integrate LLMs into workflows, and develop strategies for validating and refining LLM-generated code.
  - Focusing on higher-level skills that LLMs struggle with
    - Systems Thinking and Architecture: code readability, extensibility, testability, and maintainability
    - Problem Solving and Critical Thinking: define problems clearly, break them down into manageable parts, and reason through complex scenarios. LLMs produce plausible but incorrect code.
    - Communication and Collaboration
    - Domain Expertise
  - Exploring Adjacent Roles: product management, technical leadership, or consulting. These involve more interaction with clients and stakeholders.
  - Developing "Evergreen" Skills: debugging, system administration, and security. Or skills outside software engineering, such as trades or other hands-on vocations.
  - Scepticism: LLMs may not reach a level of sophistication that would render engineers' expertise obsolete, especially for complex problems, understanding context, and producing high-quality, maintainable code.
- [Examples of agentic AI](https://news.ycombinator.com/item?id=42431361)
  - Text-to-SQL automated business analyst: A system that generates SQL queries from natural language, handles errors, creates visualizations, and includes a FAQ component. The author calls it "constrained agentic AI."
  - Data source querying system: A bot that queries multiple SQL and API data sources, selecting tools and reformulating tasks as needed.
  - Cursor (agentic mode): An LLM-powered VS Code fork that chains together various LLM capabilities (code generation, applying changes, linting suggestions, terminal commands, codebase RAG) to reduce user prompts.
  - Vulnerability finding system: A system that uses LLM agents to discover novel vulnerabilities in open-source web applications. The agents leave traces of their actions.
  - Marketing strategy generation system: A system using approximately 60 agents to generate marketing strategies.
  - Restaurant finder: A system that searches for restaurants based on dietary preferences and group size, and downloads social media information.
  - Proofreading and editing of transcripts: LLM agents apply specific customer requirements to transcripts after human editing.
  - Meeting notes and action items generator: A system that generates meeting notes and action items.
  - O'Reilly auto parts customer service agent: An agent demonstrated using RAG.
  - UI enhancement agent: An agent that added features like language locales and dark mode to a UI.
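
A minimal sketch of the "just use NumPy" point about HNSW above: brute-force cosine similarity as a single matrix-vector product. It assumes the embeddings already exist as a 2-D array. The function names `normalize` and `top_k`, the random stand-in data, and the 10,000 × 512 size are illustrative assumptions, not the setup measured in this post.

```python
import numpy as np


def normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def top_k(query: np.ndarray, docs_normalized: np.ndarray, k: int = 5):
    """Return (indices, scores) of the k rows most similar to the query."""
    q = query / max(float(np.linalg.norm(query)), 1e-12)
    scores = docs_normalized @ q  # one matrix-vector product over every row
    k = max(1, min(k, len(scores)))
    # argpartition picks the k best without sorting the full score array
    idx = np.argpartition(-scores, k - 1)[:k]
    idx = idx[np.argsort(-scores[idx])]  # order just those k, best first
    return idx, scores[idx]


# Hypothetical stand-in data: random vectors instead of real embeddings.
rng = np.random.default_rng(0)
docs = normalize(rng.standard_normal((10_000, 512)).astype(np.float32))

# A query that is a slightly noisy copy of document 42.
query = docs[42] + 0.01 * rng.standard_normal(512).astype(np.float32)

indices, scores = top_k(query, docs, k=3)
print(indices, scores)
```

Normalizing the document matrix once up front means each query costs only one multiplication, which is why there is no index to build or maintain.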