---
title: Things I Learned - 18 May 2025
date: 2025-05-18T00:00:00+00:00
categories:
  - til
description: I explored storage options for data under 1GB, from GitHub Releases to MotherDuck. I also learned about encrypted LLM inference, Pandoc extensions for Markdown, and why you should always schedule data deletions instead of doing them live.
keywords: [github releases, motherduck, pandoc, openalex, encrypted inference, uv, bootstrap, systemd]
---

This week, I learned:

- Birds navigate using quantum entanglement! [Guardian](https://www.theguardian.com/science/2025/mar/23/they-have-no-one-to-follow-how-migrating-birds-use-quantum-mechanics-to-navigate) [ChatGPT](https://chatgpt.com/share/68282f03-3978-800c-8e46-e9979887317d)
- [DeerFlow](https://github.com/bytedance/deer-flow) is an open source Deep Research MCP. It lets you run deep research outside of the standard chatbots.
- ⭐ Today, if I had to store a bunch of data files (e.g. Parquet) under 1 GB, I would use GitHub Releases. Here are the options:
  - **GitHub Releases**. 2 GiB **per file**, unlimited total & bandwidth. 🟢 Immortal URL, versioning, easy CI publish. 🔴 Each file must stay < 2 GiB; no built-in SQL.
  - **Zenodo** (CERN). 50 GB per record; one-off bumps to 200 GB. 🟢 DOI assignment, archival mandate. 🔴 Occasional throttled bandwidth; no API for partial file reads.
  - **Hugging Face Hub**. 300 GB per repo; 50 GB per file. 🟢 Git-based, dataset tooling, lively ML community. 🔴 Large files need git-LFS; pushes via LFS can be slow.
  - **Cloudflare R2**. 10 GB storage & 1 M ops / month. 🟢 S3 API, zero-egress to Cloudflare Workers, fast. 🔴 Free tier caps storage at 10 GB.
  - **Kaggle Datasets**. 20 GB per dataset, public only. 🟢 Built-in notebooks & GPU. 🔴 No programmatic SQL API; quotas sometimes change.
  - **data.world (free)**. 1 GB total, 100 MB per dataset. 🟢 Nice social features. 🔴 Too small for this use.
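
A minimal sketch of the "immortal URL" point for GitHub Releases, assuming the file is attached to a public release. The owner, repository, tag, and file name are hypothetical placeholders; only the URL pattern is GitHub's.

```python
from urllib.parse import quote


def release_asset_url(owner: str, repo: str, tag: str, filename: str) -> str:
    """Return the download URL of a file attached to a public GitHub release."""
    # Percent-encode the file name in case it contains special characters.
    return f"https://github.com/{owner}/{repo}/releases/download/{tag}/{quote(filename)}"


# Hypothetical owner, repo, tag, and file name, for illustration only.
url = release_asset_url("example-org", "example-data", "v1.0.0", "trips.parquet")
print(url)
```

If DuckDB is installed and can read over HTTPS (an assumption, not something tested here), the same URL can be passed to `read_parquet()`, which makes up for the missing built-in SQL.
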
- If I had to query data in external Parquet or SQLite files, here are SQL engines-as-a-service:
  - **MotherDuck**. 10 GB storage + 10 CU-hrs/mo compute. Native DuckDB; no credit card; GA June 2024; monthly feature drops.
  - **Datasette Cloud**. Two-month trial (or 1-yr for non-profits). SQLite backend. Great UX, but not free forever for general use.
  - **AWS Athena**. Pay-per-TB scanned; no free tier. Costs creep quickly, and the S3 free tier ends after 12 months.
- Bootstrap has a [`.stretched-link`](https://getbootstrap.com/docs/5.3/helpers/stretched-link/) class that makes a link cover its containing block. A clever trick that I discovered when Claude 3.5 Sonnet wrote [my code](https://github.com/sanand0/sanand0.github.io/blob/0932f2efe3ad6c950c20b2ed7534ef27d8fff304/update.js#L62).
- Discovered spray and peel paints at [ArtFriend](https://artfriendonline.com/). I had no idea that was a thing.
- [Gemini Live API](https://ai.google.dev/gemini-api/docs/live) is Gemini's real-time API. It supports tools, search, and code execution.
- [mcp-mem0](https://github.com/coleam00/mcp-mem0) is an MCP for memory.
- [llm-min.txt](https://github.com/marv1nnnnn/llm-min.txt) compresses docs for LLMs to read optimally. Like a compressed [llms.txt](https://llmstxt.org/) or [context7](https://context7.com/). Usage: `GEMINI_API_KEY=... uvx llm-min -i $DIR` #ai-coding
- There's a lot of action on encrypted LLM operations.
  - OpenAI's Responses API allows reasoning tokens to be encrypted if organizations don't want their reasoning data to persist. [Ref](https://cookbook.openai.com/examples/responses_api/reasoning_items)
  - [Tinfoil](https://tinfoil.sh/) (YC X25) offers an OpenAI-compatible inference API where data is encrypted from the client to the NVIDIA Hopper/Blackwell GPUs in confidential computing mode. Prompts, model weights, and outputs are encrypted in transit and in memory, with verifiable privacy for the code running on the GPU.
  - [Modelyo](https://modelyo.com/) (Israel) offers VMs/K8s clusters with encrypted GPUs across multiple cloud providers with continuous attestation, managed on Modelyo's portal.
- ⭐ LLMs are able to work independently for longer and longer. That's a useful metric to track. [METR: Measuring AI Ability to Complete Long Tasks](https://www.lesswrong.com/posts/deesrjitvXM4xYGZd/metr-measuring-ai-ability-to-complete-long-tasks).
- If you're looking for datasets / APIs related to research publications (especially funding), then explore:
  - Crossref [API](https://api.crossref.org/swagger-ui/index.html) and [snapshots](https://www.crossref.org/documentation/retrieve-metadata/rest-api/tips-for-using-public-data-files-and-plus-snapshots/)
  - OpenAlex [API](https://docs.openalex.org/) and [snapshots](https://docs.openalex.org/download-all-data/openalex-snapshot), built by [OurResearch](https://ourresearch.org/). OpenAlex is like Crossref but includes some disambiguation
  - [OpenAIRE Graph](https://graph.openaire.eu/docs/category/downloads/) [2024](https://zenodo.org/records/13133184) / [2025](https://zenodo.org/records/14851262)
  - [Europe PMC](https://pmc.ncbi.nlm.nih.gov/articles/PMC10767826/) [dataset](https://ftp.ebi.ac.uk/pub/databases/pmc/)
- To stop Ubuntu 24 from suspending when you close the laptop lid, set one of these and restart:
  - `/etc/systemd/logind.conf`: Set `HandleLidSwitch=ignore`
  - `/etc/UPower/UPower.conf`: Set `IgnoreLid=true`
- `UV_TORCH_BACKEND=auto uv pip install torch torchvision torchaudio` installs the most appropriate PyTorch version. [Ref](https://docs.astral.sh/uv/guides/integration/pytorch/#automatic-backend-selection)
- [Cog](https://cog.readthedocs.io/en/latest/) is a Python-based templating tool. You embed Python code as comment chunks in any file, and Cog inserts the output of that code into the file.
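
A minimal sketch of what a Cog block can look like inside a Python file, assuming Cog is installed (`pip install cogapp`) and run as `cog -r constants.py`. The file name and the constants are hypothetical. The generator code sits in comments between `[[[cog` and `]]]`, and Cog writes its output between `]]]` and `[[[end]]]`, so the file stays valid Python and can be regenerated.

```python
# constants.py (hypothetical file name). Regenerate with: cog -r constants.py
# [[[cog
# import cog
# for name in ["alpha", "beta"]:
#     cog.outl(f'{name.upper()} = "{name}"')
# ]]]
ALPHA = "alpha"
BETA = "beta"
# [[[end]]]

# The generated constants are ordinary Python once Cog has run.
print(ALPHA, BETA)
```

Because the generator lines are comments, Python ignores them. Only Cog reads them, stripping the shared `# ` prefix before executing the code.
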
- [Cloudflare Zero Trust](https://www.cloudflare.com/en-in/zero-trust/products/access/) seems the easiest way to enable auth on static websites, especially if your DNS is already on Cloudflare. No cost.
- We could "fine-tune" system prompts automatically with evals, creating a "system prompt learning" paradigm, like my [promptevals](https://github.com/gramener/promptevals). [Andrej Karpathy](https://x.com/karpathy/status/1921368644069765486)
- I was asked how to improve speed when building an enterprise ChatGPT clone using an API. Here's what I'd suggest, in order:
  - Streaming. High impact, low effort.
  - Caching RAG retrieval as well as generation. High impact, low effort.
  - UI tweaks. Loading / streaming icons and progress hints ("Retrieving context", "Generating answer", etc.).
  - Parallelize, if possible.
  - Use model options where available, e.g. speculative decoding, models with higher speed, models with closer CDN, etc.
  - Shorten prompts.
  - Persistent HTTP/2 Keep-Alive. Low impact, low effort (tweak server settings).
- [Cloudflare Vectorize](https://developers.cloudflare.com/vectorize/platform/pricing/), at 768 dimensions / embedding, is free for ~6.5K chunks storage at ~1,000 queries / day. For a light load like 1M 768d chunks queried 1K times a day, the cost is: [ChatGPT](https://chatgpt.com/share/6821a25a-9f80-800c-8d95-8b2200ad6de4)
- [NVIDIA parakeet](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2) is a lightweight speech-to-text model that leads benchmarks. Installing such packages continues to be a nightmare due to PyTorch (despite `uv`).
- I explored the real-time avatar space. Heygen seems to be the easiest to use, but even that is complex and expensive ($99/mo). We may need to wait a few months for avatars to explode.
- ⭐ Model reliability is a huge enabler for performance. As models become more reliable, they can work autonomously for longer, and that is another kind of scaling.
  [Vending Bench](https://andonlabs.com/evals/vending-bench)
- ChatGPT, Gemini, etc. have become lead generation engines. Chat Bot Optimization (CBO), is it? [WhatsApp + ChatGPT](https://chatgpt.com/share/68215e14-9870-800c-a8e0-4fe476f48cc5)
- ⭐ Never delete data live. Mark it for deletion and schedule a deletion task. That way you have time to react to mistakes. [Simon Willison](https://simonwillison.net/2025/May/14/james-cowling/)
- [Pandoc](https://pandoc.org/MANUAL.html) has several options useful when converting Markdown to HTML (`cat file.md | pandoc -f markdown -t html`). My favorites:
  - `--no-highlight` skips code-highlighting. `--highlight-style=pygments` adds Pygments styling
  - `--wrap=none` stops Pandoc from hard-wrapping lines in the output
  - `--number-sections` adds section numbering (`