---
title: Things I Learned - 29 Sep 2024
date: 2024-09-29T00:00:00+00:00
categories:
  - til
description: I explored using Pyodide for browser DOM access and PyMuPDF4LLM for converting PDFs to Markdown. I also compared AVIF to GIF compression, experimented with Opus audio encoding, and researched Anthropic’s contextual retrieval methods for improved RAG performance.
keywords: [pyodide, avif, opus, pymupdf4llm, contextual retrieval, ffmpeg, rag, sentient]
---

This week, I learned:

- Pyodide can access the DOM and JavaScript in the browser
- [Jupyter Lite](https://jupyter.org/try-jupyter/lab/) lets you run Jupyter notebooks in the browser
- Animated AVIF (`.avifs`) files are about 10X smaller than GIFs. I tried creating one via [EZGIF AVIF Maker](https://ezgif.com/avif-maker/) and the .avifs file it created was 15X smaller than the GIF!
  `ffmpeg -i input.gif -c:v libaom-av1 -crf 30 -b:v 0 -cpu-used 4 -tiles 2x2 -an output.avif`
- Claude 3.5 thinks `.opus` is the best format to compress audio. It used `ffmpeg -i audio.wav -c:a libopus -b:a 16k -application voip -vbr on -compression_level 10 audio.opus`
- API coding best practices [Source](https://erikbern.com/2024/09/27/its-hard-to-write-code-for-humans.html) via [Simon Willison](https://simonwillison.net/2024/Sep/27/erik-bernhardsson/):
  - Always add screenshots to the Readme. They never break.
  - Always add examples. Humans think in examples.
  - Avoid defaults and be explicit unless 99% of the usage is with the default.
  - Make the feedback loops incredibly fast.
  - Make deprecations easy for users to deal with.
  - Keep objects immutable.
- [PyMuPDF4LLM](https://pymupdf.readthedocs.io/en/latest/pymupdf4llm/) can convert PDFs to Markdown. It handles tables, too.
  - 04 Oct 2024. [PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit) does PDF layout, formula, table, and OCR extraction using various models.
  - 04 Oct 2024. [llmsherpa](https://github.com/nlmatics/llmsherpa) extracts PDF layout and tables, but does not do OCR.
- When evaluating the feasibility of a technology with LLMs, always ask for multiple options and pick from those. [Simon Willison](https://youtu.be/6U_Zk_PZ6Kg?t=4444)
- [Gemini supports audio natively](https://ai.google.dev/gemini-api/docs/audio?lang=rest)
- Google Vertex AI has an [OpenAI compatible API](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library) but it works only for some models. Anthropic and Gemini are not compatible.
- When you paste HTML into Excel, it automatically changes the font of the cell to match the content in the HTML!
- Aptos is the new default font in Office, replacing Calibri.
- Anthropic's [Introducing Contextual Retrieval](https://www.anthropic.com/news/contextual-retrieval) says:
  - Use BM25 in addition to embeddings to match rare terms (e.g. identifiers)
  - Prepend a short, chunk-specific context (generate it with a cheap LLM that sees the whole document) to each chunk before embedding it and before building the BM25 index
  - Reranking helps with cost AND accuracy. Use [Cohere](https://cohere.com/rerank) or [Voyage](https://docs.voyageai.com/docs/reranker)
- [Sentient](https://github.com/sentient-engineering/sentient) lets you control the browser via Python in natural language
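
As a rough sketch of the Pyodide note above: inside Pyodide, `from js import document` exposes the page's DOM to Python. The helper name `add_paragraph` is my own, and the `sys.platform` guard is only there so the snippet also loads cleanly outside the browser, where the `js` module does not exist.

```python
import sys


def add_paragraph(document, text):
    """Append a <p> element with the given text to the page body and return it."""
    p = document.createElement("p")
    p.textContent = text
    document.body.appendChild(p)
    return p


# Pyodide reports sys.platform == "emscripten"; only then is `js` importable.
if sys.platform == "emscripten":
    from js import document  # proxy for the browser's global `document`

    add_paragraph(document, "Hello from Python")
```

Passing `document` in as an argument keeps the helper easy to try with a stand-in object when no browser is around.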
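
The "BM25 in addition to embeddings" idea from the Contextual Retrieval post can be sketched in plain Python. This is a toy illustration, not Anthropic's code: the chunks, their prepended context sentences, and the embedding ranking are made-up stand-ins, the BM25 scoring is a simplified variant, and reciprocal rank fusion is just one common way to merge rankings that I am assuming here.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices sorted by a simplified BM25 score, best first."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many docs each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several best-first rankings into a single best-first ranking."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            fused[doc_id] += 1 / (k + rank + 1)
    return [doc_id for doc_id, _ in fused.most_common()]


# Hypothetical chunks. The first sentence of each is the LLM-written context
# that gets prepended before indexing (hardcoded here).
chunks = [
    "From ACME's Q2 2023 report. Revenue grew 3% over the previous quarter.",
    "From ACME's Q2 2023 report. Error code TS-999 was traced to a billing bug.",
    "From Globex's 2022 annual report. Revenue fell 1% year over year.",
]

bm25_ranking = bm25_rank("TS-999", chunks)
# Stand-in for an embedding search; a real ranking would come from a vector index.
embedding_ranking = [2, 1, 0]
fused_ranking = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
print(fused_ranking)
```

In this toy setup the exact identifier match from BM25 lifts the `TS-999` chunk to the top even though the pretend embedding search ranked it second. A reranker would then be applied to the fused list.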