{ "cells": [ { "cell_type": "markdown", "id": "a6233239-1cf1-4eee-affc-d96a0747ee68", "metadata": {}, "source": [ "
Peter Norvig, Aug 2023
\n", "\n", "# One Letter Off: Word Game with Large Language Models\n", "\n", "Professor [**Serge Belongie**](https://en.wikipedia.org/wiki/Serge_Belongie) invented a word game: \n", "- *Pick a word and drop the last letter to form a second word.*\n", "- *Come up with a crossword-puzzle-style clue to help someone guess the resulting two-word phrase.*\n", "\n", "I'm calling the game **One Letter Off**, because that's one of the names that [Bard](https://bard.google.com) suggested when I asked. I'll also introduce a variant of the game where *any* letter in the word can be dropped, not just the last one.\n", "\n", "As an example, pick *board*, drop the last letter to get the phrase *boar board* and write the clue *pig plank.* In the variant, you could drop the letter *o* to get *board bard* and write the clue *pine poet.* (Note the pair of words can be in either order.)\n", "\n", "I thought it would be interesting to write a program to generate interesting word pairs and create clues. My plan is:\n", "\n", "1. Obtain a dictionary of words.\n", "2. Write code to generate all one-letter-off word pairs and sort them so the best ones come first.\n", "3. Inspect some of the pairs and manually write some clues for them, just to get a feel for the task.\n", "4. Write code to prompt a large language model and see what clues it can produce.\n", "\n", "# 1. A dictionary of words, with word vectors\n", "\n", "Given a dictionary, it is easy to find all pairs of words that are one letter off. The tricky part is to decide which are the \"good\" ones. Clearly, pairing the word *race* with *races* or *raced* or *racer* is bad, because they are all just forms of the same root word. The game is interesting when two words are very different in meaning, even though they are only one letter apart. Pairing *quaker* with *quake* is ok, because a Quaker is not just someone who quakes. So I can't rely on simple heuristic rules like \"don't drop a final -r.\" \n", "\n", "I can, however, associate each word with a [**word vector**](https://en.wikipedia.org/wiki/Word_embedding) (also called *word embedding*) and define the goodness of a pair as the distance between the pairs. Multiple research teams have published open-source dictionaries that map words to word vectors. Two words that are similar in meaning or usage have similar word vectors (and so the distance between them is small), but words with distinct usage should have dissimilar word vectors (and a larger distance between them). \n", "\n", "I downloaded a [file of word vectors](http://vectors.nlpl.eu/repository/20/0.zip) from the [NLPL Word Vectors Repository](http://vectors.nlpl.eu/repository/) and truncated it to consider only the [30,000 most common words](model30k.txt), because I didn't want to be giving clues for words that the guesser is unlikely to know. Each line in the word vector file has a word, followed by its part of speech (which we will ignore), followed by a vector of 300 numbers. For example:\n", "\n", " say_VERB -0.008861 0.097097 0.100236 0.070044 -0.079279 0.000923 ...\n", " \n", "I'll read the file into a dict of `{word: vector}` called `vectors`, but I will eliminate words that are too short, or have a non-alphabetic or uppercase character. Also, if a word appears twice with two different parts of speech, I'll use the first one (because the file is sorted with more frequent words first)." ] }, { "cell_type": "code", "execution_count": 1, "id": "bfedfb19", "metadata": {}, "outputs": [], "source": [ "from typing import *\n", "import numpy as np\n", "\n", "def read_vectors(lines: Iterable[str]) -> Dict[str, np.array]:\n", " \"\"\"Read lines of text into a dict of {word: vector} pairs.\"\"\"\n", " vectors = {}\n", " for line in lines:\n", " entry, *numbers = line.split()\n", " word, POS = entry.split('_') # Ignore the part of speech\n", " if word not in vectors and len(word) >= 4 and word.isalpha() and word.islower():\n", " vectors[word] = np.array([float(x) for x in numbers])\n", " return vectors\n", "\n", "vectors = read_vectors(open('model30k.txt'))" ] }, { "cell_type": "code", "execution_count": 2, "id": "efdc80fe", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "18331" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(vectors)" ] }, { "cell_type": "markdown", "id": "8ad74e30", "metadata": {}, "source": [ "There are 18,331 distinct entries, out of the 30,000 lines in the original file.\n", "\n", "The [Euclidean distance](https://en.wikipedia.org/wiki/Euclidean_distance) between two word vectors is the [norm](https://en.wikipedia.org/wiki/Norm_(mathematics)) of their difference:" ] }, { "cell_type": "code", "execution_count": 3, "id": "d02a7b33", "metadata": {}, "outputs": [], "source": [ "def distance(word1, word2, vectors) -> float:\n", " \"\"\"Distance between vectors for the two words.\"\"\"\n", " return np.linalg.norm(vectors[word1] - vectors[word2])" ] }, { "cell_type": "code", "execution_count": 4, "id": "02fa9df6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.3230686663760125" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "distance('quaker', 'quake', vectors)" ] }, { "cell_type": "code", "execution_count": 5, "id": "d40b7470", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.7579947920361987" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "distance('smoker', 'smoke', vectors)" ] }, { "cell_type": "markdown", "id": "f2c125d7", "metadata": {}, "source": [ "# 2. Word pairs, best first\n", "\n", "The next step is to form word pairs by considering each word, looking at all ways to drop a letter from each word, and checking if those are words. Then we'll sort them so the \"best\" pairs come first." ] }, { "cell_type": "code", "execution_count": 6, "id": "e2061a00", "metadata": {}, "outputs": [], "source": [ "def sorted_pairs(vectors, drop_fn: callable) -> List[Tuple[str, str]]:\n", " \"\"\"List of (word1, word2) pairs, biggest word-vector distance first.\"\"\"\n", " pairs = [(w1, w2) for w1 in vectors for w2 in drop_fn(w1, vectors)]\n", " pairs.sort(key=lambda pair: distance(*pair, vectors), reverse=True)\n", " return pairs" ] }, { "cell_type": "markdown", "id": "6c65db76", "metadata": {}, "source": [ "Here are the two functions to drop letters (either just the last letter, or any letter) and the function `words`, which checks which of the resulting strings are in fact words." ] }, { "cell_type": "code", "execution_count": 7, "id": "227cbcc7", "metadata": {}, "outputs": [], "source": [ "def drop_last_letter(word, vectors) -> Set[str]: \n", " \"\"\"All ways to drop the last letter from word to form a word in `vectors`.\"\"\"\n", " return words({word[:-1]}, vectors)\n", "\n", "def drop_any_letter(word, vectors) -> Set[str]: \n", " \"\"\"All ways to drop one letter from word to form a word in `vectors`.\"\"\"\n", " return words({word[:i] + word[i + 1:] for i in range(len(word))}, vectors)\n", "\n", "def words(candidates, vectors) -> Set[str]:\n", " \"\"\"All candidate strings that are words in the `vectors` dict.\"\"\"\n", " return {w for w in candidates if w in vectors}" ] }, { "cell_type": "code", "execution_count": 8, "id": "6d82b297", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'clam'}" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "drop_last_letter('clamp', vectors)" ] }, { "cell_type": "code", "execution_count": 9, "id": "1e4c8df5", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'camp', 'clam', 'clap', 'lamp'}" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "drop_any_letter('clamp', vectors)" ] }, { "cell_type": "markdown", "id": "5f9cfe4c", "metadata": {}, "source": [ "We're ready to generate a sorted list of word pairs:" ] }, { "cell_type": "code", "execution_count": 10, "id": "0b3e1834", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1129" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pairs = sorted_pairs(vectors, drop_last_letter)\n", "len(pairs)" ] }, { "cell_type": "markdown", "id": "21903917", "metadata": {}, "source": [ "There are 1,129 pairs, but I'll just look at the first 50:" ] }, { "cell_type": "code", "execution_count": 11, "id": "c55e51a6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[('seedy', 'seed'),\n", " ('depth', 'dept'),\n", " ('hindu', 'hind'),\n", " ('sloth', 'slot'),\n", " ('plumb', 'plum'),\n", " ('tense', 'tens'),\n", " ('reverb', 'rever'),\n", " ('irish', 'iris'),\n", " ('siren', 'sire'),\n", " ('trusty', 'trust'),\n", " ('meter', 'mete'),\n", " ('pleat', 'plea'),\n", " ('sinew', 'sine'),\n", " ('chancel', 'chance'),\n", " ('heath', 'heat'),\n", " ('aspiring', 'aspirin'),\n", " ('forth', 'fort'),\n", " ('combo', 'comb'),\n", " ('drama', 'dram'),\n", " ('paste', 'past'),\n", " ('outwith', 'outwit'),\n", " ('filly', 'fill'),\n", " ('board', 'boar'),\n", " ('livery', 'liver'),\n", " ('forcep', 'force'),\n", " ('stocky', 'stock'),\n", " ('corporal', 'corpora'),\n", " ('photon', 'photo'),\n", " ('forte', 'fort'),\n", " ('rabbit', 'rabbi'),\n", " ('median', 'media'),\n", " ('cello', 'cell'),\n", " ('chart', 'char'),\n", " ('spark', 'spar'),\n", " ('liver', 'live'),\n", " ('caster', 'caste'),\n", " ('irony', 'iron'),\n", " ('insider', 'inside'),\n", " ('heron', 'hero'),\n", " ('macho', 'mach'),\n", " ('heroine', 'heroin'),\n", " ('polyp', 'poly'),\n", " ('gravely', 'gravel'),\n", " ('primer', 'prime'),\n", " ('honey', 'hone'),\n", " ('quaker', 'quake'),\n", " ('tablet', 'table'),\n", " ('grant', 'gran'),\n", " ('prime', 'prim'),\n", " ('valet', 'vale')]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pairs[:50]" ] }, { "cell_type": "markdown", "id": "3e87f893", "metadata": {}, "source": [ "Now I'll consider the variant where we can drop any letter:" ] }, { "cell_type": "code", "execution_count": 12, "id": "09dfc2fa", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[('mitre', 'mite'),\n", " ('seedy', 'seed'),\n", " ('posit', 'post'),\n", " ('resign', 'resin'),\n", " ('insect', 'inset'),\n", " ('score', 'core'),\n", " ('parse', 'arse'),\n", " ('depth', 'dept'),\n", " ('convert', 'covert'),\n", " ('thank', 'tank'),\n", " ('hindu', 'hind'),\n", " ('orally', 'rally'),\n", " ('stigma', 'sigma'),\n", " ('naive', 'nave'),\n", " ('harmful', 'armful'),\n", " ('sloth', 'slot'),\n", " ('canyon', 'canon'),\n", " ('bassist', 'assist'),\n", " ('preach', 'peach'),\n", " ('launder', 'lander'),\n", " ('congenital', 'congenial'),\n", " ('supper', 'upper'),\n", " ('duress', 'dress'),\n", " ('usher', 'user'),\n", " ('crate', 'rate'),\n", " ('platitude', 'latitude'),\n", " ('ironic', 'ionic'),\n", " ('sever', 'seer'),\n", " ('quilt', 'quit'),\n", " ('timer', 'tier'),\n", " ('yeast', 'east'),\n", " ('sturdy', 'study'),\n", " ('tamper', 'taper'),\n", " ('crater', 'cater'),\n", " ('frame', 'fame'),\n", " ('broach', 'roach'),\n", " ('stage', 'sage'),\n", " ('leaver', 'lever'),\n", " ('restate', 'estate'),\n", " ('sluice', 'slice'),\n", " ('blinding', 'binding'),\n", " ('gamble', 'gable'),\n", " ('strait', 'trait'),\n", " ('pledge', 'ledge'),\n", " ('clean', 'clan'),\n", " ('shite', 'site'),\n", " ('sched', 'shed'),\n", " ('holist', 'hoist'),\n", " ('plumb', 'plum'),\n", " ('waive', 'wave')]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sorted_pairs(vectors, drop_any_letter)[:50]" ] }, { "cell_type": "markdown", "id": "44a30715", "metadata": {}, "source": [ "Both lists provide pretty promising pairs! \n", "\n", "But did I really accomplish the goal of eliminating bad pairs? I think so! The pairs at the end of the list are exactly the kind of thing I wanted to get rid of:" ] }, { "cell_type": "code", "execution_count": 13, "id": "56cf7c96", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[('northwards', 'northward'),\n", " ('decentralised', 'decentralise'),\n", " ('insured', 'insure'),\n", " ('thanks', 'thank'),\n", " ('surpluse', 'surplus'),\n", " ('eastwards', 'eastward'),\n", " ('devoted', 'devote'),\n", " ('opposed', 'oppose'),\n", " ('alleged', 'allege'),\n", " ('waken', 'wake'),\n", " ('diall', 'dial'),\n", " ('increased', 'increase'),\n", " ('randomised', 'randomise'),\n", " ('brewery', 'brewer'),\n", " ('involved', 'involve'),\n", " ('vaginal', 'vagina'),\n", " ('larval', 'larva'),\n", " ('fabliaux', 'fabliau'),\n", " ('towards', 'toward'),\n", " ('mucosal', 'mucosa')]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pairs[-20:]" ] }, { "cell_type": "markdown", "id": "5fa3c30e", "metadata": {}, "source": [ "# 3 Manually writing clues\n", "\n", "I can take some of these suggested pairs, and make up clues on my own:\n", "\n", "| pair | drop | clue |\n", "| :---- | :---- | :---- |\n", "|alley ally|any|**Bowling teammate**|\n", "|binding blinding|any|**Mandatory punishment for the cyclops Polyphemus**|\n", "|class lass|any|**Sophisticated young lady**|\n", "|finnish finish|any|**Scandinavian ending**|\n", "|harmful armful|any|**Dangerous bundle**|\n", "|latitude platitude|any|**Parallel cliche**|\n", "|plum plumb|last|**Most excellent toilet installation job**|" ] }, { "cell_type": "markdown", "id": "d502d513", "metadata": {}, "source": [ "# 3. Prompting an LLM to write clues\n", "\n", "I wrote some code to submit pairs of words to a large language model (LLM) and get back a clue. The basic approach looks like this:" ] }, { "cell_type": "code", "execution_count": 14, "id": "36a92c46", "metadata": {}, "outputs": [], "source": [ "from requests import post\n", "\n", "def get_a_clue(pair, url, api_key) -> str:\n", " \"\"\"Ask a LLM to generate a clue for the pair of words.\"\"\"\n", " phrase = ' '.join(pair)\n", " return ask(f'Write a clever crossword puzzle clue for the phrase \"{phrase}\"', url, api_key)\n", "\n", "def ask(query, url, api_key) -> str:\n", " \"\"\"Ask PaLM model to respond to the query.\"\"\"\n", " response = post(url, headers={\"Authorization\": f\"Bearer {api_key}\"}, json={\"query\": query})\n", " response.raise_for_status()\n", " return response.json()[\"queryResult\"][\"fulfillmentText\"]" ] }, { "cell_type": "markdown", "id": "1c4bbb58", "metadata": {}, "source": [ "However, I'm not going to run the above code in this notebook, for three reasons:\n", "1. Every time the code is re-run it produces a different response, and some runs are not very good.\n", "2. For security, I can't show the API key in this notebook, so you would need to get your own API key.\n", "3. There are many LLMs and I don't want to imply that this is the only one to use.\n", "\n", "Instead I'll show in the table below some of the previously-generated responses, edited for brevity, and selecting mostly good responses, with a few bad ones to illustrate specific points. The model gets a lot of things right:\n", "- Some of the clues are really good!\n", "- The model mostly understands how clues work: short, not too hard, not too easy.\n", "- The model knows that word play or puns are desireable.\n", "- The model gives an explanation of the clue as well as the clue itself.\n", "- Twice the model used a question mark, a crossword puzzle convention to signal that the clue is not straightforward.\n", "\n", "\n", "The model also gets some things wrong:\n", "- Some of the clues are really bad!\n", "- Many clues break a cardinal rule by mentioning one of the answer words (see \"board boar\", etc).\n", "- The model often claims that a clue is \"a play on\" two meanings of a word (so far so good), but then gets the words or meanings confused (see \"frame fame\", \"pleat plea\", \"stocky stock\").\n", "\n", "\n", "|pair|drop|clue|explanation|\n", "| :---- | :---- | :---- | :----------- |\n", "|aspiring aspirin|any|**Wannabe pain reliever**|The clue suggests that an \"aspiring aspirin\" is a something that wants to be a pain reliever, but is not yet one. |\n", "|blinding binding|any|**A dazzling cover-up**|The clue suggests that a \"blinding binding\" is a very bright or dazzling cover-up, such as the cover of a book with a shiny metallic finish or a book with a cover made of sequins.|\n", "|board boar|last|**A pig that's a member of the board of directors**|This clue is clever because it is a pun on the words \"board\" and \"boar\". It also plays on the stereotype of pigs being greedy and wanting to be in positions of power.|\n", "|cello cell|last|**Musical prison?**|This clue is clever because it is a pun on the word \"cell.\" A \"cell\" can refer to a room in a prison, or it can refer to the part of a cello that contains the strings. |\n", "|covert convert|any|**Agent of change**|This clue is clever because it is a play on words. \"Agent of change\" can refer to someone who is trying to make a difference in the world, but it can also refer to a spy or undercover agent.|\n", "|crater cater|any|**A space rock that serves food**|The word \"crater\" refers to a large bowl-shaped depression in the ground, often formed by the impact of a meteorite or asteroid.|\n", "|finnish finish|any|**How a Suomi might end a race**|The word \"Suomi\" is the Finnish word for Finland. The clue suggests how a Finnish person might end a race, because it is a play on the words \"Finnish\" and \"finish.\"|\n", "|frame fame|any|**What kind of fame is displayed on a wall?**|This clue is clever because it is a play on words. The word \"fame\" can mean both \"celebrity\" and \"a picture in a frame.\" |\n", "|flying fling|any|**Airy affair**|This clue is a bit of a pun, as it plays on the two meanings of the word \"fling\": a brief and passionate romantic relationship, and the act of throwing something. |\n", "|ledge pledge|any|**A promise to
hang on**|The clue suggests that a \"pledge ledge\" is a promise to hang on to something, such as a promise to keep a secret or to repay a debt.|\n", "|pleat plea|last|**Fashionable plea for more folds?**|This clue is clever because it is a pun on the two meanings of the word \"pleat\": (1) A fold in a piece of fabric (2) A request or appeal|\n", "|plum plumb|last|**Perfectly perpendicular**|The clue suggests that a \"plum plumb\" is a line that is perfectly perpendicular to the ground, like a plumb line.|\n", "|quaker quake|any|**A religious tremor**|The clue suggests that a \"quaker quake\" is a type of religious tremor, which could be interpreted in a number of ways. For example, it could refer to a powerful spiritual experience, or to a time of social and political upheaval.|\n", "|quilt quit|any|**Sewing machine strike**|The clue suggests that a \"quilt quit\" is a strike by sewing machines, which would prevent people from making quilts.|\n", "|rabbit rabbi|last|**What kind of religious leader is all ears**|This clue is clever because it plays on the double meaning of the word \"ears.\" On the one hand, rabbits are known for having large ears. On the other hand, \"all ears\" is an idiom that means to be very attentive and eager to listen.|\n", "|rally orally|any|**Shout for support**|The answer is \"rally orally\", which is a phrase that could be used to describe someone shouting for support.|\n", "|seedy seed|last|**The kernel of a shady deal**|The clue suggests that a \"seedy seed\" is the beginning of something shady or immoral. This could refer to a criminal activity, a broken promise, or even a relationship that is doomed.|\n", "|stage sage|any|**wise actor**|The clue suggests that a \"stage sage\" is a wise actor who is at the top of their game.|\n", "|stocky stock|last|**Beefy broth**|This clue is a play on the two meanings of the word \"stock\": (1) A type of broth made from meat or bones. (2) A large quantity of something, especially goods or money.|\n", "|timer tier|any|**A level of timekeeping**|The clue suggests that a \"timer tier\" is a level of timekeeping, such as the different tiers of accuracy that are available for timers. For example, a timer used for cooking may have a lower tier of accuracy than a timer used for scientific research.|\n", "|yeast east|any|**Where the dough rises**|The clue suggests that \"yeast east\" is the direction in which the sun rises because yeast causes dough to rise.|\n", "\n", "What do you think? Are these clues any good? Can you do better, either by manually writing your own, or by writing a better program (e.g. by finding better word pairs, by constructing a better prompt, or by using a different LLM that performs better on this task)?" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.12" } }, "nbformat": 4, "nbformat_minor": 5 }