# Tracery and Python

by [Allison Parrish](http://www.decontextualize.com/)

This is a tutorial on how to use [Tracery](http://www.crystalcodepalace.com/tracery.html) in Python.
Tracery is a computer language for random text generation originally developed
by [Kate Compton](https://twitter.com/galaxykate).

The easiest way to use Tracery in Python is to install [pytracery](https://github.com/aparrish/pytracery), my Python port of Kate's original code. You can install it on the command line with `pip`:

 $ pip install tracery

If you're running this in Jupyter Notebook, you can execute the cell below:

In [1]:
import sys
!{sys.executable} -m pip install tracery



Then you need to import the `tracery` library:

In [2]:
import tracery

You don't need Python to use Tracery! [Here's a version of this tutorial](http://air.decontextualize.com/tracery/) that you can use with any implementation of Tracery. I recommend Beau Gunderson's [Tracery writer](https://beaugunderson.com/tracery-writer/) as a kind of Tracery playground. You can also use Kate Compton's
[Tracery tutorial](http://www.crystalcodepalace.com/traceryTut.html), which has a [visual editor](http://www.brightspiral.com/tracery/), or [Cheap Bots Done Quick](http://cheapbotsdonequick.com/), which has a built-in editor for writing Tracery grammars for Twitter bots with a minimum of fuss.

You might be interested in reading [Nora Reed's explanation of how
@nerdgarbagebot works](http://barrl.net/2801), which takes you through the
process of ideating and implementing a Tracery grammar for a Twitter bot. (Nora
Reed makes a lot of amazing bots with Tracery, including
[@thinkpiecebot](https://twitter.com/thinkpiecebot).)

## Rules and expansions

A Tracery *grammar* is a series of rules that tell the computer how to put text
together, piece by piece. Tracery grammars consist of a series of *rules* and
*expansions*. The goal of writing a Tracery grammar is to write rules and
expansions that, when followed by the computer, produce interesting (funny,
insightful, poetic) text. The word for generating a text from a grammar is
"expand"---we'll be talking a lot below about "expanding" the grammar into a
text. (Hopefully the reasons for using this word will become clear!)

In Python, Tracery rules and expansions are written as dictionaries, where the rules are keys and the expansions are values. Here's an example of a complete, but very boring, Tracery grammar:

In [3]:
rules = {
 "origin": "Hello, world!"
}

To generate text from this grammar, first create a Tracery `Grammar` object like so, passing the rules as the only parameter:

In [4]:
grammar = tracery.Grammar(rules)

Then call the `.flatten()` method of the `Grammar` object with `"#origin#"` as the only parameter. (I'll talk about what `#origin#` means in a second.)

In [5]:
grammar.flatten("#origin#")

'Hello, world!'

This grammar can produce only one text: `Hello, world!`. Not very interesting,
but helpful for the moment to illustrate how a grammar is put together and how to make it produce some output.

Here's a Tracery grammar with two rules, written again as a dictionary, where each rule and its expansion are key/value pairs:

In [6]:
rules = {
 "origin": "Hello, #noun#!",
 "noun": "galaxy"
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")

'Hello, galaxy!'

This grammar, again, can only ever produce one text: `Hello, galaxy!` But it
accomplishes it in a slightly more sophisticated way. Notice in the expansion
for the `origin` rule the following text:

 #noun#

When the Tracery generator encounters text that looks like this---a word surrounded by
`#` signs---it looks in the grammar for a rule with the same name as the word,
and *replaces* the text with the expansion for that rule.
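
To make the replacement step concrete, here is a minimal pure-Python sketch of the idea. It is not pytracery's actual implementation: the `expand` function and its regular expression are assumptions made for illustration, and it only handles rules whose expansion is a single string, with no modifiers.

```python
import re

# Matches a rule name wrapped in "#" signs, e.g. "#noun#".
TAG = re.compile(r"#(\w+)#")

def expand(rules, text):
    """Replace every #name# in text with the expansion of rule `name`.

    Illustrative sketch only: expansions must be plain strings, and a
    reference to a missing rule raises KeyError.
    """
    def substitute(match):
        # Expand recursively, because an expansion may itself contain tags.
        return expand(rules, rules[match.group(1)])
    return TAG.sub(substitute, text)

rules = {
    "origin": "Hello, #noun#!",
    "noun": "galaxy"
}
print(expand(rules, "#origin#"))  # prints: Hello, galaxy!
```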

Let's add a third rule to this grammar, just to see how it looks:

In [7]:
rules = {
 "origin": "#greeting#, #noun#!",
 "greeting": "Howdy",
 "noun": "galaxy"
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")

'Howdy, galaxy!'

> EXERCISE: Add another rule for the punctuation at the end of the sentence, so
> that the grammar produces the text "Howdy, galaxy?"

## Adding alternatives

The examples above are really boring, because they can only ever produce one
output. In order for a grammar to be able to produce different outputs, we need
to make the expansions of our rules have *alternatives* for the computer to
choose between. Rules with alternatives look like this:

 "rule": ["alternative one", "alternative two", "alternative three"]

That is: the value of the rule is a *list* of strings (instead of an individual string). When Tracery expands a rule whose value is a list, it will select one item from the list at random.
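
The selection step can be sketched in a few lines of plain Python. The `pick` helper below is hypothetical (it is not part of pytracery) and assumes every alternative in the list is equally likely.

```python
import random

def pick(expansion):
    """Return the expansion itself if it is a string, or a random item if it is a list."""
    if isinstance(expansion, str):
        return expansion
    return random.choice(expansion)

alternatives = ["alternative one", "alternative two", "alternative three"]
print(pick(alternatives))   # one of the three strings, chosen at random
print(pick("only option"))  # a single string is returned unchanged
```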

Here's our "Hello, world!" grammar, now with multiple alternatives for what
we're greeting:

In [8]:
rules = {
 "origin": "#greeting#, #noun#!",
 "greeting": "Howdy",
 "noun": ["world", "solar system", "galaxy", "local cluster", "universe"]
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")

'Howdy, galaxy!'

Run the cell over and over again and you'll see different outputs. (Sometimes
it'll look like it isn't working, but that's just because the computer randomly
selected the same alternative twice in a row. It can happen!)

Let's make this "Hello, world!" example *even more interesting* by adding
alternatives for the `greeting` rule:

In [9]:
rules = {
 "origin": "#greeting#, #noun#!",
 "greeting": ["Howdy", "Hello", "Greetings", "What's up", "Hey", "Hi"],
 "noun": ["world", "solar system", "galaxy", "local cluster", "universe"]
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")

'Hello, local cluster!'
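
Because `origin` refers to each of the other two rules exactly once, the number of distinct outputs is the number of greetings times the number of nouns. The sketch below enumerates them with `itertools.product` rather than with Tracery. It is a by-hand check that assumes this particular `origin` template.

```python
from itertools import product

greetings = ["Howdy", "Hello", "Greetings", "What's up", "Hey", "Hi"]
nouns = ["world", "solar system", "galaxy", "local cluster", "universe"]

# Every (greeting, noun) pair corresponds to one possible expansion of
# "#greeting#, #noun#!".
all_outputs = [f"{greeting}, {noun}!" for greeting, noun in product(greetings, nouns)]

print(len(all_outputs))  # 6 greetings * 5 nouns = 30
print(all_outputs[0])    # Howdy, world!
```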

Sometimes for debugging purposes, it's nice to generate multiple outputs from the same grammar in one cell execution. To do this, `print()` the return value of the `.flatten()` method in a `for` loop. (You don't have to re-create the `Grammar` object each time.)

In [10]:
rules = {
 "origin": "#greeting#, #noun#!",
 "greeting": ["Howdy", "Hello", "Greetings", "What's up", "Hey", "Hi"],
 "noun": ["world", "solar system", "galaxy", "local cluster", "universe"]
}
grammar = tracery.Grammar(rules)
for i in range(5):
 print(grammar.flatten("#origin#"))

What's up, world!
Greetings, solar system!
Hey, local cluster!
Howdy, local cluster!
What's up, universe!
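
When you generate many outputs, a tally can be easier to read than a long printout. The sketch below defines a hypothetical `tally` helper (not part of pytracery) that calls any zero-argument function repeatedly and counts the results. To keep the example self-contained, it uses a stand-in generator built on `random.choice` instead of a real grammar.

```python
import random
from collections import Counter

def tally(generate, n):
    """Call generate() n times and count how often each output appears."""
    return Counter(generate() for _ in range(n))

# Stand-in for a grammar: picks a greeting and a noun at random.
greetings = ["Howdy", "Hello", "Hey"]
nouns = ["world", "galaxy"]

def fake_flatten():
    return f"{random.choice(greetings)}, {random.choice(nouns)}!"

counts = tally(fake_flatten, 1000)
for text, count in counts.most_common():
    print(count, text)
```

With a real grammar, you would pass `lambda: grammar.flatten("#origin#")` as the first argument instead of `fake_flatten`.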


Remember that in Python you can format dictionaries and lists with some flexibility. For example, your grammar might be a bit more readable if you write each option on a separate line:

In [11]:
rules = {
    "origin": "#greeting#, #noun#!",
    "greeting": [
        "Howdy",
        "Hello",
        "Greetings",
        "What's up",
        "Hey",
        "Hi"
    ],
    "noun": [
        "world",
        "solar system",
        "galaxy",
        "local cluster",
        "universe"
    ]
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")

"What's up, local cluster!"

You don't always have to write the expansions as string literals and list literals. You can use a variable with a list assigned to it, for example—this is especially helpful if you have a long list of things that you plan to use in multiple grammars, or if you want to get the list of things from another source (e.g., a text file).
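
For instance, a list of alternatives can be built with ordinary Python code before it goes into the grammar dictionary. In this sketch the `planets` and `numbers` variables are made up for illustration. The numbers are converted with `str()` on the assumption that expansions should be strings, as they are everywhere else in this tutorial.

```python
planets = ["Mercury", "Venus", "Earth", "Mars"]

# Build alternatives programmatically instead of typing each one out.
numbers = [str(n) for n in range(2, 10)]

rules = {
    "origin": "#greeting#, #number# moons of #planet#!",
    "greeting": "Howdy",
    "number": numbers,
    "planet": planets
}

print(rules["number"])  # ['2', '3', '4', '5', '6', '7', '8', '9']
```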

## Modifiers

Let's make a more sophisticated grammar that produces sentences in the format
"Dammit Jim, I'm a X, not a Y!" popularized by the ground-breaking science
fiction program, *Star Trek*. I happen to have a list of professions, which I'm going to put into a variable here. (I got this list from Darius Kazemi's [Corpora Project](https://github.com/dariusk/corpora/tree/master/data)—an excellent place to find lists of things. And they're already preformatted in a way that makes it easy to cut-and-paste them into your Tracery grammars.)

In [12]:
professions = [
 "accountant",
 "actor",
 "archeologist",
 "astronomer",
 "audiologist",
 "bartender",
 "curator",
 "detective",
 "economist",
 "editor",
 "engineer",
 "epidemiologist",
 "farmer",
 "flight attendant",
 "forest fire prevention specialist",
 "graphic designer",
 "hydrologist",
 "librarian",
 "mathematician",
 "middle school teacher",
 "nutritionist",
 "painter",
 "rancher",
 "referee",
 "reporter",
 "sailor",
 "sociologist",
 "stonemason",
 "surgeon",
 "tailor",
 "taxi driver",
 "teacher",
 "therapist",
 "tour guide",
 "umpire",
 "undertaker",
 "urban planner",
 "veterinarian",
 "web developer",
 "welder",
 "writer",
 "zoologist"
]

The grammar for generating our Star Trek phrase might look like this:

In [13]:
rules = {
 "origin": "#interjection#, #name#! I'm a #profession#, not a #profession#!",
 "interjection": ["alas", "congratulations", "eureka", "fiddlesticks",
 "good grief", "hallelujah", "oops", "rats", "thanks", "whoa", "yes"],
 "name": ["Jim", "John", "Tom", "Steve", "Kevin", "Gary", "George", "Larry"],
 "profession": professions
}
grammar = tracery.Grammar(rules)
grammar.flatten("#origin#")

"congratulations, George! I'm a graphic designer, not a audiologist!"

This is pretty good, but there are problems. The first is that we typed in all
of the interjections in lower case, but they're supposed to have the first
letter capitalized (since they're at the beginning of the sentence). The second
problem is that the grammar occasionally produces something like

 yes, George! I'm a economist, not a zoologist!

"A economist" isn't right. It should be "a*n* economist." English indefinite
articles are tricky that way!

There are several ways to solve these problems. We could just change all of our
interjections to be capitalized, and add the appropriate article to the
beginning of each profession. But (1) this would be time-consuming and (2) it
means that we could never *re-use* those rules in places that need the
unmodified text. What to do?

Thankfully, Tracery comes equipped with a series of *modifiers* that take the expansion of a rule and apply a transformation to it. The modifiers are included with pytracery, but they're in a separate module, so you need to import them in their own import statement:

In [14]:
from tracery.modifiers import base_english

And then you have to explicitly "add" them to the `Grammar` object after you create it, like so:

 grammar = tracery.Grammar(rules)
 grammar.add_modifiers(base_english)

The two modifiers we're going to use are `.a`, which adds the appropriate indefinite article before the expansion of a rule, and `.capitalize`, which capitalizes the first letter of the expansion.

Use the modifiers by adding a period and the modifier's name (e.g., `.capitalize`) inside the `#` signs, *right after* the name of the rule. For example, change:

 #interjection#

to

 #interjection.capitalize#
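
To get a feel for what these two modifiers do to the text, here are rough pure-Python stand-ins. They are simplified assumptions for illustration, not the code in `base_english`: real English has exceptions ("an hour", "a unicorn") that this version of `a` ignores.

```python
def capitalize(text):
    """Upper-case the first character and leave the rest untouched."""
    return text[:1].upper() + text[1:]

def a(text):
    """Prefix 'an' before a vowel letter, otherwise 'a' (a crude rule of thumb)."""
    article = "an" if text and text[0].lower() in "aeiou" else "a"
    return f"{article} {text}"

print(capitalize("good grief"))  # Good grief
print(a("editor"))               # an editor
print(a("flight attendant"))     # a flight attendant
```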

Here's our "Dammit Jim" generator with the modifiers in place:

In [15]:
rules = {
 "origin": "#interjection.capitalize#, #name#! I'm #profession.a#, not #profession.a#!",
 "interjection": ["alas", "congratulations", "eureka", "fiddlesticks",
 "good grief", "hallelujah", "oops", "rats", "thanks", "whoa", "yes"],
 "name": ["Jim", "John", "Tom", "Steve", "Kevin", "Gary", "George", "Larry"],
 "profession": professions
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
grammar.flatten("#origin#")

"Whoa, Kevin! I'm a flight attendant, not an editor!"

Nice! Another modifier you can use is `.s`, which turns the text in the expansion into its plural version. Using this, we can modify the above example to be a *Star Wars* meme instead of a *Star Trek* one:

In [16]:
rules = {
 "origin": "These aren't the #profession.s# we're looking for.",
 "profession": professions
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
grammar.flatten("#origin#")

"These aren't the librarians we're looking for."

## The `origin` rule

By convention, the "starting" rule of Tracery grammars is called `origin`. A lot of tools that use Tracery grammars follow this convention, and for ease of interoperability it's probably best if you do too. But you can actually use any name you want, as long as you use that name in the call to `.flatten()`. For example, we could rewrite the above example like so:

In [17]:
rules = {
 "start": "These aren't the #profession.s# we're looking for.",
 "profession": professions
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
grammar.flatten("#start#")

"These aren't the referees we're looking for."

Like any other rule, the "starting" rule can have multiple options. We could use this to, for example, create a grammar that outputs Star Wars memes half the time and Star Trek memes the other half:

In [18]:
rules = {
 "origin": ["#interjection.capitalize#, #name#! I'm #profession.a#, not #profession.a#!",
 "These aren't the #profession.s# we're looking for."],
 "interjection": ["alas", "congratulations", "eureka", "fiddlesticks",
 "good grief", "hallelujah", "oops", "rats", "thanks", "whoa", "yes"],
 "name": ["Jim", "John", "Tom", "Steve", "Kevin", "Gary", "George", "Larry"],
 "profession": professions
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
for i in range(10):
 print(grammar.flatten("#origin#"))

These aren't the audiologists we're looking for.
Good grief, Larry! I'm a graphic designer, not a tour guide!
Alas, George! I'm an audiologist, not an audiologist!
These aren't the detectives we're looking for.
Hallelujah, George! I'm a tailor, not a mathematician!
These aren't the zoologists we're looking for.
These aren't the hydrologists we're looking for.
Eureka, Jim! I'm a graphic designer, not a rancher!
These aren't the zoologists we're looking for.
These aren't the bartenders we're looking for.


## Rules within rules within rules

The grammars we've written together so far have replacement syntax (`#somethinglikethis#`) only in the expansions for the `origin` rule. But you can include that syntax in any expansion you want! This is a powerful tool for building sophisticated grammars that are built up from reusable parts. For example, this tiny model of English reuses the `noun` and `verb` rules in multiple places, thereby preventing repetition and increasing expressiveness.

In [19]:
rules = {
 "origin": "#nounphrase.capitalize# #verbphrase#.",
 "nounphrase": ["the #noun#", "the #noun#", "#noun.a#", "#noun.a#", "the #noun# that #verbphrase#",
 "the #noun# #prep# #nounphrase#"],
 "verbphrase": ["#verb#", "#verb# #nounphrase#", "#verb# #prep# #nounphrase#"],
 "noun": ["amoeba", "dichotomy", "seagull", "trombone", "corsage", "restaurant", "suburb"],
 "verb": ["awakens", "bends", "burns", "closes", "expands", "fails", "fractures", "gathers",
 "melts", "opens", "ripens", "scatters", "stops", "sways", "turns", "unfurls", "worries"],
 "prep": ["in", "on", "over", "against"]
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
for i in range(10):
 print(grammar.flatten("#origin#"))

A dichotomy worries against a dichotomy.
An amoeba awakens the dichotomy that opens an amoeba.
The corsage that worries ripens on a trombone.
A trombone melts over an amoeba.
The restaurant that ripens turns.
The seagull on the seagull over the trombone on the trombone closes on the suburb on an amoeba.
The restaurant that gathers the corsage unfurls.
An amoeba bends a dichotomy.
A corsage fails on the seagull that worries on the amoeba.
The trombone that sways against a suburb fails.
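
Notice that `nounphrase` lists `"the #noun#"` and `"#noun.a#"` twice each. Assuming each list item is equally likely to be picked, repeating an alternative makes it proportionally more likely, which here favors the short alternatives that don't refer back to `nounphrase` or `verbphrase`. The sketch below just counts the list items; it doesn't call Tracery.

```python
from collections import Counter
from fractions import Fraction

nounphrase = ["the #noun#", "the #noun#", "#noun.a#", "#noun.a#",
              "the #noun# that #verbphrase#", "the #noun# #prep# #nounphrase#"]

# Probability of each distinct alternative under a uniform pick from the list.
for alternative, count in Counter(nounphrase).items():
    print(alternative, Fraction(count, len(nounphrase)))

# Chance of picking an alternative that refers back to another phrase rule.
recursive = [alt for alt in nounphrase if "phrase#" in alt]
print(Fraction(len(recursive), len(nounphrase)))  # 1/3
```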


## Loading alternatives from an external source

All of the examples we've looked at so far in this notebook have used literal strings and lists as the expansions for rules (i.e., the values for keys in the grammar dictionary). If you're using Tracery in Python, there are some techniques you can use for loading expansions from external data sources instead. This is a good option if you have a large number of alternatives for a particular expansion.

Let's say that you have a text file which has one alternative per line, like [this list of adjectives from my plaintext example files repository](https://github.com/aparrish/plaintext-example-files/blob/master/adjs.txt). To use this, first download the file into the same directory as this notebook. Then execute the following cell to load the file in as a list, with one element per line in the file:

In [21]:
adjs = open("adjs.txt").read().splitlines()

Now you have a list of adjectives. Let's take a peek inside to make sure we've loaded the file correctly.

In [23]:
adjs[100:110]

['bolstered',
 'bonnie',
 'bored',
 'boundary',
 'bounded',
 'bounding',
 'branched',
 'brawling',
 'brazen',
 'breeding']

Having loaded in this list, you can now use it as the expansion for a rule. To do this, put the variable name of the list as the rule expansion in the grammar. Here, I've adapted the code from the grammar above to incorporate a new `adj` rule, whose expansion is the list of adjectives. I've also added references to the `adj` rule in various expansions for the `nounphrase` and `verbphrase` rules:

In [25]:
rules = {
 "origin": "#nounphrase.capitalize# #verbphrase#.",
 "nounphrase": ["the #noun#", "the #adj# #noun#", "#noun.a#", "#adj.a# #noun#", "the #noun# that #verbphrase#",
 "the #noun# #prep# #nounphrase#"],
 "verbphrase": ["#verb#", "#verb# #nounphrase#", "#verb# #prep# #nounphrase#", "is #adj#"],
 "noun": ["amoeba", "dichotomy", "seagull", "trombone", "corsage", "restaurant", "suburb"],
 "verb": ["awakens", "bends", "burns", "closes", "expands", "fails", "fractures", "gathers",
 "melts", "opens", "ripens", "scatters", "stops", "sways", "turns", "unfurls", "worries"],
 "prep": ["in", "on", "over", "against"],
 "adj": adjs
}
grammar = tracery.Grammar(rules)
grammar.add_modifiers(base_english)
for i in range(10):
 print(grammar.flatten("#origin#"))

The floral suburb is uninvited.
A hands-off seagull is arrested.
The seagull burns.
The dichotomy in the suburb that ripens the suburb that closes on the trombone on a dichotomy scatters on a layered seagull.
A mated amoeba melts a nonsense corsage.
A seagull awakens against a voluptuous corsage.
The trombone on the dichotomy burns.
A graven suburb burns the trombone over the corsage against the medical trombone.
An uncooperative amoeba awakens.
The trombone on the dichotomy against the seagull against the dichotomy against the intern corsage is robust.


To do this for other rules, copy the line of code above that loads in the adjectives, and change the filename to a different file with one entry per line. (Also make sure to make a different variable name!)
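
If you load several files this way, a small helper keeps things tidy. The `load_lines` function below is a hypothetical convenience, not part of pytracery. It closes the file when it's done and skips blank lines, so stray empty lines in the file don't become empty alternatives. The file name `animals.txt` is made up, and the sketch writes a tiny temporary file first so that it runs on its own.

```python
import os
import tempfile

def load_lines(path):
    """Return the non-blank lines of a text file, stripped of surrounding whitespace."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

# Create a small sample file so this example is self-contained.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "animals.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("amoeba\nseagull\n\nwombat\n")
    animals = load_lines(path)

print(animals)  # ['amoeba', 'seagull', 'wombat']
```

The resulting list can be used as a rule's expansion in the same way as `adjs` above.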

## Next steps

Congratulations, you now know the basics of writing Tracery grammars and using them in Python.

Tracery has a number of features that we didn't go into here, including the
ability to *save* the output of a rule to be re-used later in the same
expansion. See [Kate Compton's tutorial](http://www.crystalcodepalace.com/traceryTut.html)
for more information. You might be interested in [these advanced text generators](http://www.brightspiral.com/) that Kate Compton made with Tracery.

If you're a JavaScript programmer and you want to incorporate Tracery into your
own projects, [the source code is available
here](https://github.com/galaxykate/tracery) (also available as [a Node module](https://github.com/v21/tracery)).