{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Speculative magic words workbook\n", "\n", "By [Allison Parrish](http://www.decontextualize.com/)\n", "\n", "(Early draft, incomplete, under construction gif here)\n", "\n", "The goal of this notebook is to demonstrate some computational means for exploring the literary genre of the *magic word*. For present purposes, I define a \"magic word\" as a string of letters that affords a foregrounding of its material properties (e.g., spelling, pronunciation), and suggests some effect beyond meaning alone. The underlying assumption (maybe faulty) is that magic words with similar material properties will also have similar effects, and that by writing computer programs to produce magic words (whether from whole cloth or as variants on other magic words), we can produce *new* magic words with *new* effects.\n", "\n", "I don't understand this notebook as a way of *casting* spells, but merely as a way of investigating potential forms. Hence: *speculative* magic words.\n", "\n", "The notebook serves as a demonstration of (1) Python string manipulation techniques; and (2) the Pincelate library for grapheme-to-phoneme and phoneme-to-grapheme translation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preliminaries\n", "\n", "Some of these examples will be data-driven, i.e., we need an existing corpus of words. 
[Download this file](https://github.com/dariusk/corpora/blob/master/data/words/nouns.json) into the same folder as this notebook like so:" ] }, { "cell_type": "code", "execution_count": 332, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " % Total % Received % Xferd Average Speed Time Time Time Current\n", " Dload Upload Total Spent Left Speed\n", "100 18192 100 18192 0 0 94259 0 --:--:-- --:--:-- --:--:-- 94750\n" ] } ], "source": [ "!curl -L -O https://raw.githubusercontent.com/dariusk/corpora/master/data/words/nouns.json" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The file contains a list of English nouns. The cell below reads them into a Python list, which we'll use throughout the rest of the notebook." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import json\n", "# open the file in a with-block so that it is closed after reading\n", "with open(\"nouns.json\") as f:\n", "    nouns = [item.lower() for item in json.load(f)['nouns']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `random` module has a function `choice` that picks one item from a list at random:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import random" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'mediator'" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "random.choice(nouns)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Orthographic variations\n", "\n", "> \"[W]riting gave physical permanence to words.... Written words continued to act in one's behalf long after the sound of spoken words had ceased\" (Skemer 133)\n", "\n", "> \"Motion terminates at no other end save its own beginning, in order to cease and rest in it... In the intelligible world... 
Grammar begins with the letter, from which all writing is derived and into which it is all resolved\" (John Scotus Erigena, quoted in Leggott 46)\n", "\n", "> \"[T]he unit of textual meaning—the letter—lacks meaning itself. The alphabet's semantic vacuum represents a threat to orthodoxy, for into this space competing meaning systems may rush.\" (Crain 18)\n", "\n", "The words in many apotropaic charms exhibit certain kinds of manipulation that we can characterize as *orthographic* in nature—i.e., they have to do with the letters in the words. In this section of the notebook, I show some computer code for performing these transformations explicitly.\n", "\n", "The following cell defines a short text that we'll use for testing purposes:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "text = \"in the beginning was the notebook\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Cacography\n", "\n", "> \"In medieval manuscripts, the letters themselves were frequently a source of confusion. [...] The first letters of words can be omitted... while others are doubled up.... Words can be dislocated,\" \"compounded,\" \"contracted,\" \"abbreviated\"; \"letters vanish. [...] [W]e should also mention the variations made with uppercase and lowercase letters.... May this overview give the reader a small idea of the difficulties encountered by the researcher!\" (Lecouteux xxi)\n", "\n", "\"Cacography\" here means writing with mistakes. In medieval grimoires, mistakes were usually introduced as errors in copying, but the presence of errors actually made people perceive the spells as more powerful. We can simulate these errors in Python." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Compounding/contracting words\n", "\n", "This operation \"contracts\" two words, smooshing together the first and last parts." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "noun1 = random.choice(nouns)\n", "noun2 = random.choice(nouns)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "spoiler intercession\n" ] } ], "source": [ "print(noun1, noun2)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'spoession'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "noun1[:len(noun1)//2] + noun2[len(noun2)//2:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In function form:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'allrish'" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def smoosh(a, b):\n", "    # first half of a, followed by second half of b\n", "    return a[:len(a)//2] + b[len(b)//2:]\n", "smoosh(\"allison\", \"parrish\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Dislocation\n", "\n", "This operation inserts random spaces, dislocating words from each other."
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "in the beginning was the notebo ok\n" ] } ], "source": [ "out = \"\"\n", "for ch in text:\n", "    if random.random() < 0.1:\n", "        out += \" \"\n", "    out += ch\n", "print(out)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a function:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'ab rac adabra'" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def dislocate(s, prob=0.1):\n", "    out = \"\"\n", "    for ch in s:\n", "        if random.random() < prob:\n", "            out += \" \"\n", "        out += ch\n", "    return out\n", "dislocate(\"abracadabra\")" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "' a b r ac ad a b r a'" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dislocate(\"abracadabra\", 0.75)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Coding, transliteration, encryption\n", "\n", "Another strategy for producing magic words is transliterating them (e.g., converting Greek letters to their Roman equivalents) or applying ciphers (like a [substitution cipher](https://en.wikipedia.org/wiki/Substitution_cipher), in which each letter is replaced with another letter). These techniques retain the underlying *structure* of the spelling, so the resulting form doesn't look entirely random. 
But they don't retain the surface form; they make the familiar unfamiliar.\n", "\n", "#### Character ciphers\n", "\n", "The function below implements simple character replacement:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "def replace_by_char(s, ch_map):\n", "    out = \"\"\n", "    for ch in s:\n", "        if ch in ch_map:\n", "            out += ch_map[ch]\n", "        else:\n", "            out += ch\n", "    return out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You need to give the function a dictionary that maps each letter you want replaced to the letter that should appear in the output; characters missing from the dictionary pass through unchanged. This dictionary maps each letter to the letter that follows it in the alphabet (with `z` wrapping around to `a`):" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "nextch_map = {\n", "    'a': 'b', 'b': 'c', 'c': 'd', 'd': 'e',\n", "    'e': 'f', 'f': 'g', 'g': 'h', 'h': 'i',\n", "    'i': 'j', 'j': 'k', 'k': 'l', 'l': 'm',\n", "    'm': 'n', 'n': 'o', 'o': 'p', 'p': 'q',\n", "    'q': 'r', 'r': 's', 's': 't', 't': 'u',\n", "    'u': 'v', 'v': 'w', 'w': 'x', 'x': 'y',\n", "    'y': 'z', 'z': 'a'\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Call it on a string:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'bmmjtpo qbssjti'" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "replace_by_char(\"allison parrish\", nextch_map)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A well-known cipher in computer programming culture is [rot13](https://en.wikipedia.org/wiki/ROT13), in which each character is replaced with the character that comes thirteen spots later in the alphabet (wrapping around the end of the alphabet as needed). 
It's so common, it's already implemented in Python:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'nyyvfba cneevfu'" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import codecs\n", "codecs.encode(\"allison parrish\", 'rot13')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Mirror writing\n", "\n", "> \"According to legend, some devil-pacts were written in retrograde to invoke diabolical powers. [...] Artists depicted retrograde writing as demonic. In a 15th c. block book, a demon is shown holding up a tablet on which the sins of the dying man's life are recorded in mirror writing...\" (Skemer 121)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "# from https://github.com/combatwombat/Lunicode.js/blob/master/lunicode.js\n", "mirror_replacements = {\n", " 'a': 'ɒ', 'b': 'd', 'c': 'ɔ', 'd': 'b', 'e': 'ɘ', \n", " 'f': 'Ꮈ', 'g': 'ǫ', 'h': 'ʜ', 'i': 'i', 'j': 'ꞁ',\n", " 'k': 'ʞ', 'l': 'l', 'm': 'm', 'n': 'ᴎ', 'o': 'o',\n", " 'p': 'q', 'q': 'p', 'r': 'ɿ', 's': 'ꙅ', 't': 'ƚ',\n", " 'u': 'u', 'v': 'v', 'w': 'w', 'x': 'x', 'y': 'ʏ', 'z': 'ƹ',\n", " 'A': 'A', 'B': 'ᙠ', 'C': 'Ɔ', 'D': 'ᗡ', 'E': 'Ǝ',\n", " 'F': 'ꟻ', 'G': 'Ꭾ', 'H': 'H', 'I': 'I', 'J': 'Ⴑ',\n", " 'K': '⋊', 'L': '⅃', 'M': 'M', 'N': 'Ͷ', 'O': 'O',\n", " 'P': 'ꟼ', 'Q': 'Ọ', 'R': 'Я', 'S': 'Ꙅ', 'T': 'T',\n", " 'U': 'U', 'V': 'V', 'W': 'W', 'X': 'X', 'Y': 'Y', 'Z': 'Ƹ'}" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "in the beginning was the notebook iᴎ ƚʜɘ dɘǫiᴎᴎiᴎǫ wɒꙅ ƚʜɘ ᴎoƚɘdooʞ\n" ] } ], "source": [ "print(text + \" \" + replace_by_char(text, mirror_replacements))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Mimicking handwriting mistakes and misinterpretations\n", "\n", "Magic words gain power from being copied over and over; 
mistakes creep in that make the words strange. Lecouteux (p. xxi) suggests that the following accidental replacements were common in medieval manuscripts written in Roman scripts:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "# suggested in Lecouteux, p. xxi\n", "replacements = {\n", "    'u': ['o', 'n'],\n", "    'st': ['h'],\n", "    'p': ['f'],\n", "    'ni': ['m'],\n", "    'rn': ['m'],\n", "    'in': ['m'],\n", "    'iu': ['m', 'in'],\n", "    'r': ['t', 'z', 'c'],\n", "    'l': ['t'],\n", "    'c': ['t'],\n", "    'd': ['ol']\n", "}" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "import re\n", "import random" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These replacements have to be implemented a bit differently from the character substitution ciphers, because the patterns on the left have varying numbers of characters. So we can't just step straight through the source string character by character. The following code finds every occurrence of each character sequence on the left (dictionary keys) and, if a coin flip succeeds, replaces it with one of the suggested replacements on the right (dictionary values), chosen at random. Because the patterns are applied one after another, text produced by an earlier replacement can itself be changed by a later one." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "in the beginning was the notebook\n", "in the begmmng was the notebook\n" ] } ], "source": [ "out = text\n", "for patt, repl in replacements.items():\n", "    out = re.sub(patt,\n", "        lambda m: random.choice(repl) if random.random() < 0.5 else m.group(),\n", "        out)\n", "print(text)\n", "print(out)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Abbreviations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> [In magic spells] \"we find sequences of letters that can be the initials of words. [...] 
A passage from the *Gesta Imperatorum* suggests this; in fact we read there the sequence \"P P P, S S S, R R R, F F F,\" meaning, \"Pater patriae perditur, sapientia secum sustollitur, ruunt regna Rome ferro, flamma, fame.\" The series of letters would therefore be a mnemonic means used to retain whole phrases, but in charms it also serves as a way to keep things secret...\" (Lecouteux xx)\n", "\n", "The following function takes a string and returns the first *n* characters of each word in the string (as a list)." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['h', 't', 'h', 'a', 'y']" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def abbrev(s, take=1):\n", " words = s.split()\n", " return [w[:take] for w in words]\n", "abbrev(\"hello there how are you?\")" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['in', 'th', 'be', 'wa', 'th', 'no']" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "abbrev(text, 2)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "inthbewathno\n" ] } ], "source": [ "print(''.join(abbrev(text, 2)))" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "In. Th. Be. Wa. Th. No\n" ] } ], "source": [ "init_cap = [item.capitalize() for item in abbrev(text, 2)]\n", "print('. 
'.join(init_cap))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Formatting\n", "\n", "According to Skemer, magic words and formulas such as *abracadabra* and *abraxas* were \"often written as diminishing and augmenting series of letters\"—shaped in \"inverted triangles\" or \"[mandorlas](https://en.wikipedia.org/wiki/Mandorla)\" (116).\n", "\n", "The following function implements a word triangle, in which the word is spelled out one letter at a time, with each successively longer prefix on its own line (returned as a list of strings). It's demonstrated here with a second call that reverses the order; printed together, the two halves form a mandorla." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "a\n", "ab\n", "abr\n", "abra\n", "abrac\n", "abraca\n", "abracad\n", "abracada\n", "abracadab\n", "abracadabr\n", "abracadabra\n", "abracadabra\n", "abracadabr\n", "abracadab\n", "abracada\n", "abracad\n", "abraca\n", "abrac\n", "abra\n", "abr\n", "ab\n", "a\n" ] } ], "source": [ "def triangle(s):\n", "    out = []\n", "    for i in range(len(s)):\n", "        snippet = s[:i+1]\n", "        out.append(snippet)\n", "    return out\n", "print(\"\\n\".join(triangle(\"abracadabra\")))\n", "print(\"\\n\".join(reversed(triangle(\"abracadabra\"))))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `mandorla` function performs both steps:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "def mandorla(s):\n", "    return triangle(s)[:-1] + list(reversed(triangle(s)))" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "a\n", "ab\n", "abr\n", "abra\n", "abrac\n", "abraca\n", "abracad\n", "abracada\n", "abracadab\n", "abracadabr\n", "abracadabra\n", "abracadabr\n", "abracadab\n", "abracada\n", "abracad\n", "abraca\n", "abrac\n", "abra\n", "abr\n", "ab\n", "a\n" ] } ], "source": [ 
"print(\"\\n\".join(mandorla(\"abracadabra\")))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Jupyter Notebook displays text in a fixed-width font by default, so centering doesn't work very well. Instead, we'll write the lines out as HTML and display with Jupyter Notebook's HTML widget:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "from IPython.display import display, HTML" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "html_src = \"