{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div style=\"text-align: right\">Peter Norvig, Feb 2017</div>\n",
    "\n",
    "# BASIC Interpreter\n",
    "\n",
    "[Years ago](http://norvig.com/lispy.html), I showed how to write an Interpreter for a dialect of Lisp. Some readers appreciated it, and some asked about an interpreter for a language that isn't just a bunch of parentheses. In 2014 I saw a [celebration](http://time.com/69316/basic/) of the 50th anniversary of the 1964 [Dartmouth BASIC](http://web.archive.org/web/20120716185629/http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf) interpreter, and thought that I could show how to implement such an interpreter.  I never quite finished in 2014, but now it is 2017, I rediscovered this unfinished file, and completed it. For those of you unfamiliar with BASIC, here is a sample program:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "program = '''\n",
    "10 REM POWER TABLE\n",
    "11 DATA 8, 4\n",
    "15 READ N0, P0\n",
    "20 PRINT \"N\",\n",
    "25 FOR P = 2 to P0\n",
    "30   PRINT \"N ^\" P,\n",
    "35 NEXT P\n",
    "40 PRINT \"SUM\"\n",
    "45 LET S = 0\n",
    "50 FOR N = 2 TO N0\n",
    "55   PRINT N,\n",
    "60   FOR P = 2 TO P0\n",
    "65     LET S = S + N ^ P\n",
    "70     PRINT N ^ P,\n",
    "75   NEXT P\n",
    "80   PRINT S\n",
    "85 NEXT N\n",
    "99 END\n",
    "'''"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Of course I don't have to build everything from scratch in assembly language, and I don't have to worry about every byte of storage, like [Kemeny](http://www.dartmouth.edu/basicfifty/basic.html), [Gates](http://www.pagetable.com/?p=774), and [Woz](http://www.retrothing.com/2008/07/restoring-wozs.html) did, so my job is much easier.  The interpreter consists of three phases: \n",
    "* **Tokenization**: breaking a text into a list of tokens, for example: `\"10 READ N\"` becomes `['10', 'READ', 'N']`.\n",
    "* **Parsing**: building a representation from the tokens: `Stmt(num=10, typ='READ', args=['N'])`.\n",
    "* **Execution**: follow the flow of the program and do what each statement says; in this case the `READ` statement\n",
    "has the effect of an assignment: `variables['N'] = data.popleft()`.\n",
    "\n",
    "\n",
    "\n",
    "# Tokenization\n",
    "\n",
    "One way to turn a line of text into a list of tokens is with the `findall` method of a regular expression that defines all the legal tokens:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import re \n",
    "\n",
    "tokenize = re.compile(r'''\n",
    "    \\d* \\.? \\d+ (?: E -? \\d+)?                     | # number \n",
    "    SIN|COS|TAN|ATN|EXP|ABS|LOG|SQR|RND|INT|FN[A-Z]| # functions\n",
    "    LET|READ|DATA|PRINT|GOTO|IF|FOR|NEXT|END       | # keywords\n",
    "    DEF|GOSUB|RETURN|DIM|REM|TO|THEN|STEP|STOP     | # keywords\n",
    "    [A-Z]\\d? | # variable names (letter + optional digit)\n",
    "    \".*?\"    | # labels (strings in double quotes)\n",
    "    <>|>=|<= | # multi-character relational operators\n",
    "    \\S         # any non-space single character ''', \n",
    "    re.VERBOSE).findall"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The only complicated part is the syntax for numbers: optional digits followed by an optional decimal point, some digits, and optionally a power of 10 marked by `\"E\"` and followed by an (optional) minus sign and some digits. \n",
    "Example usage of `tokenize`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['10', 'READ', 'N']"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tokenize('10 READ N')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['100', 'PRINT', '\"SIN(X)^2 = \"', ',', 'SIN', '(', 'X', ')', '^', '2']"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tokenize('100 PRINT \"SIN(X)^2 = \", SIN(X) ^ 2')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That looks good. Note that  my tokens are just strings; it will be the parser's job, not the tokenizer's, to recognize that `'2'` is a number and  `'X'` is the name of a variable.  (In some interpreters, the tokenizer makes  distinctions like these.)\n",
    "\n",
    "There's one important complication: spaces don't matter in BASIC programs, so the following should all be equivalent:\n",
    "\n",
    "    10 GOTO 99\n",
    "    10GOTO99\n",
    "    10 GO TO 99\n",
    "    \n",
    "The problem is that  `tokenize` gets the last one wrong:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['10', 'G', 'O', 'TO', '99']"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tokenize('10 GO TO 99')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "My first thought was to remove all white space from the input. That would work for this example, but it would change the token `\"HELLO WORLD\"` to `\"HELLOWORLD\"`, which is wrong.  To remove spaces everywhere *except* between  double quotes, I can tokenize the line and join the tokens back together. Then I can re-tokenize to get the final list of tokens; I do that in my new function below called `tokenizer`. \n",
    "\n",
    "Once I have a list of tokens, I access them through this interface:\n",
    "* `peek()`: returns the next token in `tokens` (without changing `tokens`), or `None` if there are no more tokens.\n",
    "* `pop()`: removes and returns the next token. \n",
    "* `pop(`*string*`)`: removes and returns the next token if it is equal to the string; else return `None` and leave `tokens` unchanged.\n",
    "* `pop(`*predicate*`)`: remove and return the next token if *predicate*(*token*) is true; else return `None`, leave `tokens` alone."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "tokens = [] # Global variable to hold a list of tokens\n",
    "\n",
    "def tokenizer(line):\n",
    "    \"Return a list of the tokens on this line, handling spaces properly, and upper-casing.\"\n",
    "    line = ''.join(tokenize(line)) # Remove whitespace\n",
    "    return tokenize(line.upper())\n",
    "\n",
    "def peek(): \n",
    "    \"Return the first token in the global `tokens`, or None if we are at the end of the line.\"\n",
    "    return (tokens[0] if tokens else None)\n",
    "\n",
    "def pop(constraint=None):\n",
    "    \"\"\"Remove and return the first token in `tokens`, or return None if token fails constraint.\n",
    "    constraint can be None, a literal (e.g. pop('=')), or a predicate (e.g. pop(is_varname)).\"\"\"\n",
    "    top = peek()\n",
    "    if constraint is None or (top == constraint) or (callable(constraint) and constraint(top)):\n",
    "        return tokens.pop(0)\n",
    "    \n",
    "def remove_spaces(line): \n",
    "    \"Remove white space from line, except space inside double quotes.\"\n",
    "    return \n",
    "\n",
    "def lines(text): \n",
    "    \"A list of the non-empty lines in a text.\"\n",
    "    return [line for line in text.splitlines() if line]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(Note: if I expected program lines to contain many tokens, I would use a `deque` instead of a `list` of tokens.) \n",
    "\n",
    "We can test `tokenizer` and the related functions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'ok'"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def test_tokenizer():\n",
    "    global tokens\n",
    "    assert tokenizer('X-1') == ['X', '-', '1'] # Numbers don't have a leading minus sign, so this isn't ['X', '-1']\n",
    "    assert tokenizer('PRINT \"HELLO WORLD\"') == ['PRINT', '\"HELLO WORLD\"']\n",
    "    assert tokenizer('10 GOTO 99') == tokenizer('10GOTO99') == tokenizer('10 GO TO 99') == ['10', 'GOTO', '99']\n",
    "    assert (tokenizer('100 PRINT \"HELLO WORLD\", SIN(X) ^ 2') == \n",
    "            ['100', 'PRINT', '\"HELLO WORLD\"', ',', 'SIN', '(', 'X', ')', '^', '2'])\n",
    "    assert (tokenizer('100IFX1+123.4+E1-12.3E4 <> 1.2E-34*-12E34+1+\"HI\" THEN99') ==\n",
    "            ['100', 'IF', 'X1', '+', '123.4', '+', 'E1', '-', '12.3E4', '<>', \n",
    "             '1.2E-34', '*', '-', '12E34', '+', '1', '+', '\"HI\"', 'THEN', '99'])\n",
    "    assert lines('one line') == ['one line']\n",
    "    assert lines(program) == [\n",
    "     '10 REM POWER TABLE',\n",
    "     '11 DATA 8, 4',\n",
    "     '15 READ N0, P0',\n",
    "     '20 PRINT \"N\",',\n",
    "     '25 FOR P = 2 to P0',\n",
    "     '30   PRINT \"N ^\" P,',\n",
    "     '35 NEXT P',\n",
    "     '40 PRINT \"SUM\"',\n",
    "     '45 LET S = 0',\n",
    "     '50 FOR N = 2 TO N0',\n",
    "     '55   PRINT N,',\n",
    "     '60   FOR P = 2 TO P0',\n",
    "     '65     LET S = S + N ^ P',\n",
    "     '70     PRINT N ^ P,',\n",
    "     '75   NEXT P',\n",
    "     '80   PRINT S',\n",
    "     '85 NEXT N',\n",
    "     '99 END']\n",
    "    assert [tokenizer(line) for line in lines(program)] == [\n",
    "     ['10', 'REM', 'P', 'O', 'W', 'E', 'R', 'T', 'A', 'B', 'L', 'E'],\n",
    "     ['11', 'DATA', '8', ',', '4'],\n",
    "     ['15', 'READ', 'N0', ',', 'P0'],\n",
    "     ['20', 'PRINT', '\"N\"', ','],\n",
    "     ['25', 'FOR', 'P', '=', '2', 'TO', 'P0'],\n",
    "     ['30', 'PRINT', '\"N ^\"', 'P', ','],\n",
    "     ['35', 'NEXT', 'P'],\n",
    "     ['40', 'PRINT', '\"SUM\"'],\n",
    "     ['45', 'LET', 'S', '=', '0'],\n",
    "     ['50', 'FOR', 'N', '=', '2', 'TO', 'N0'],\n",
    "     ['55', 'PRINT', 'N', ','],\n",
    "     ['60', 'FOR', 'P', '=', '2', 'TO', 'P0'],\n",
    "     ['65', 'LET', 'S', '=', 'S', '+', 'N', '^', 'P'],\n",
    "     ['70', 'PRINT', 'N', '^', 'P', ','],\n",
    "     ['75', 'NEXT', 'P'],\n",
    "     ['80', 'PRINT', 'S'],\n",
    "     ['85', 'NEXT', 'N'],\n",
    "     ['99', 'END']]\n",
    "\n",
    "    tokens = tokenizer('10 GO TO 99') \n",
    "    assert peek() == '10'\n",
    "    assert pop()  == '10'\n",
    "    assert peek() == 'GOTO'\n",
    "    assert pop()  == 'GOTO'\n",
    "    assert peek() == '99'\n",
    "    assert pop(str.isalpha) == None    # '99' is not alphabetic\n",
    "    assert pop('98.6') == None         # '99' is not '98.6'\n",
    "    assert peek() == '99'\n",
    "    assert pop(str.isnumeric)  == '99' # '99' is numeric\n",
    "    assert peek() is None and not tokens \n",
    "    \n",
    "    return 'ok'\n",
    "    \n",
    "test_tokenizer()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['10 REM POWER TABLE',\n",
       " '11 DATA 8, 4',\n",
       " '15 READ N0, P0',\n",
       " '20 PRINT \"N\",',\n",
       " '25 FOR P = 2 to P0',\n",
       " '30   PRINT \"N ^\" P,',\n",
       " '35 NEXT P',\n",
       " '40 PRINT \"SUM\"',\n",
       " '45 LET S = 0',\n",
       " '50 FOR N = 2 TO N0',\n",
       " '55   PRINT N,',\n",
       " '60   FOR P = 2 TO P0',\n",
       " '65     LET S = S + N ^ P',\n",
       " '70     PRINT N ^ P,',\n",
       " '75   NEXT P',\n",
       " '80   PRINT S',\n",
       " '85 NEXT N',\n",
       " '99 END']"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lines(program)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Parsing: Grammar of Statements\n",
    "\n",
    "Parsing is the process of transforming the text of a program into an internal representation, which can then be executed.\n",
    "For BASIC, the representation will be an ordered list of statements, and we'll need various data types to represent the parts of the statements.\n",
    "I'll start by showing the grammar of BASIC statements, as seen on pages 56-57 of [the manual](http://web.archive.org/web/20120716185629/http://www.bitsavers.org/pdf/dartmouth/BASIC_Oct64.pdf) (see also pages 26-30 for a simpler introduction). A statement starts with a line number, and then can be one of the following 15 types of statements, each \n",
    "type introduced by a distinct keyword:\n",
    "\n",
    "- **`LET`**  `<variable>` **=** `<expression>`\n",
    "- **`READ`** `<list of variable>`\n",
    "- **`DATA`** `<list of number>`\n",
    "- **`PRINT`** `<sequence of label and expression>`\n",
    "- **`GOTO`** `<linenumber>`\n",
    "- **`IF`** `<expression> <relational> <expression>` **`THEN`** `<linenumber>`\n",
    "- **`FOR`** `<varname>` **=** `<expression>` **`TO`** `<expression> [`**`STEP`** `<expression>]`\n",
    "- **`NEXT`** `<varname>`\n",
    "- **`END`**\n",
    "- **`STOP`**\n",
    "- **`DEF`** `<funcname>`**(**`<varname>`**) = **`<expression>`\n",
    "- **`GOSUB`** `<linenumber>`\n",
    "- **`RETURN`**\n",
    "- **`DIM`** `<list of variable>`\n",
    "- **`REM`** `<any string of characters whatsoever>`\n",
    "              \n",
    "The notation `<variable>` means any variable and `<list of variable>` means zero or more variables, separated by commas.  `[`**`STEP`** `<expression>]` means that the literal string `\"STEP\"` followed by an expression is optional. \n",
    "\n",
    "Rather than use one of the many [language parsing frameworks](https://wiki.python.org/moin/LanguageParsing), I will show how to build a parser from scratch. First I'll translate the grammar above into  Python. Not character-for-character (because it would take a lot of work to get Python to understand how to handle those characters), but almost word-for-word (because I can envision a straightforward way to get Python to handle the following format):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "def Grammar(): \n",
    "    return {\n",
    "    'LET':    [variable, '=', expression],\n",
    "    'READ':   [list_of(variable)],\n",
    "    'DATA':   [list_of(number)],\n",
    "    'PRINT':  [labels_and_expressions],\n",
    "    'GOTO':   [linenumber],\n",
    "    'IF':     [expression, relational, expression, 'THEN', linenumber],\n",
    "    'FOR':    [varname, '=', expression, 'TO', expression, step],\n",
    "    'NEXT':   [varname],\n",
    "    'END':    [],\n",
    "    'STOP':   [],\n",
    "    'DEF':    [funcname, '(', varname, ')', '=', expression],\n",
    "    'GOSUB':  [linenumber],\n",
    "    'RETURN': [],\n",
    "    'DIM':    [list_of(variable)], \n",
    "    'REM':    [anycharacters],\n",
    "    'A':    []\n",
    "    }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Parsing Strategy\n",
    "\n",
    "The grammar of BASIC is designed so that at every point, the next token tells us unambiguously how to parse. For example, the first token after the line number defines the type of statement; also, in an expression we know that all three-letter names are functions while all 1-letter names are variables. So in writing the various grammatical category functions, a common pattern is to either `peek()` at the next token or try a `pop(`*constraint*`)`, and from that decide what to parse next, and never have to back up or undo a `pop()`. Here is my strategy for parsing statements:\n",
    "\n",
    "* The  grammatical categories, like `variable` and `expression` (and also `statement`), will be defined as functions\n",
    "(with no argument) that pop tokens from the global variable `tokens`, and return a data object. For example, calling `linenumber()` will pop a token, convert it to an `int`, and return that. \n",
    "\n",
    "* Consider parsing the statement `\"20 LET X = X + 1\"`. \n",
    "\n",
    "* First tokenize to get: `tokens = ['20', 'LET', 'X', '=', 'X', '+', '1']`.\n",
    "\n",
    "* Then call `statement()` (defined below).\n",
    "\n",
    "  * `statement` first  calls `linenumber()`, getting back the integer `20` (and removing `'20'` from `tokens`).\n",
    "\n",
    "  * Then it calls `pop()` to get  `'LET'` (and removing `'LET'` from `tokens`).\n",
    "  * Then it indexes into the grammar with `'LET'`, retrieving the grammar rule `[variable, '=', expression]`.\n",
    "\n",
    "  * Then it processes the 3 constituents of the grammar rule:\n",
    "    * First, call `variable()`, which removes and returns `'X'`.\n",
    "    * Second, call `pop('=')`, which removes `'='` from `tokens`, and discard it.\n",
    "    * Third, call `expression()`, which returns a representation of `X + 1`; let's write that as `Opcall('X', '+', 1.0)`.\n",
    "\n",
    "  * Finally, `statement` assembles the pieces and returns `Stmt(num=20, typ='LET', args=['X', Opcall('X', '+', 1.0)])`.\n",
    "* If anything goes wrong, call `fail(\"`*error message*`\")`, which raises an error.\n",
    "\n",
    "Here is the definition of `statement`:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def statement():\n",
    "    \"Parse a BASIC statement from `tokens`.\"\n",
    "    num  = linenumber()\n",
    "    typ  = pop(is_stmt_type) or fail('unknown statement type')\n",
    "    args = []\n",
    "    for p in grammar[typ]: # For each part of rule, call if callable or match if literal string\n",
    "        if callable(p):\n",
    "            args.append(p())\n",
    "        else:\n",
    "            pop(p) or fail('expected ' + repr(p))\n",
    "    return Stmt(num, typ, args)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some of the grammatical categories, like `expression`, are complex. But many of the categories are easy one-liners:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "def number():        return (-1 if pop('-') else +1) * float(pop()) # Optional minus sign\n",
    "def step():          return (expression() if pop('STEP') else 1)    # 1 is the default step\n",
    "def linenumber():    return (int(pop()) if peek().isnumeric() else fail('missing line number'))\n",
    "def relational():    return pop(is_relational) or fail('expected a relational operator')\n",
    "def varname():       return pop(is_varname)    or fail('expected a variable name')\n",
    "def funcname():      return pop(is_funcname)   or fail('expected a function name')\n",
    "def anycharacters(): tokens.clear() # Ignore tokens in a REM statement"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here are the predicates that distinguish different types of tokens:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def is_stmt_type(x):  return is_str(x) and x in grammar # LET, READ, ...\n",
    "def is_funcname(x):   return is_str(x) and len(x) == 3 and x.isalpha()  # SIN, COS, FNA, FNB, ...\n",
    "def is_varname(x):    return is_str(x) and len(x) in (1, 2) and x[0].isalpha() # A, A1, A2, B, ...\n",
    "def is_label(x):      return is_str(x) and x.startswith('\"') # \"HELLO WORLD\", ...\n",
    "def is_relational(x): return is_str(x) and x in ('<', '=', '>', '<=', '<>', '>=')\n",
    "def is_number(x):     return is_str(x) and x and x[0] in '.0123456789' # '3', '.14', ...\n",
    "\n",
    "def is_str(x):        return isinstance(x, str)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that `varname` means an unsubscripted variable name (a letter  by itself, like `X`, or followed by a digit, like `X3`), and that `variable` is a `varname` optionally followed by index expressions in parentheses, like `A(I)` or `M(2*I, 3)`: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def variable(): \n",
    "    \"Parse a possibly subscripted variable e.g. 'X3' or 'A(I)' or 'M(2*I, 3)'.\"\n",
    "    V = varname()\n",
    "    if pop('('):\n",
    "        indexes = list_of(expression)()\n",
    "        pop(')') or fail('expected \")\" to close subscript')\n",
    "        return Subscript(V, indexes) # E.g. 'A(I)' => Subscript('A', ['I'])\n",
    "    else: \n",
    "        return V                     # E.g. 'X3' "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`list_of` is tricky because it works at two different times. When I write `list_of(number)` in the grammar, this returns an object of class `list_of`. When this object is called (just as other grammatical categories like `variable` are called), the effect is that it will parse a list of `number`. That list can be empty (if there are no more tokens on the line), or can be a single number, or can be several numbers separated by comma tokens. I could have defined `list_of` as a function that returns a function, but I thought that it would be clearer to define it as a class, so I can clearly separate what happens at the two different times: first the `__init__` method determines what category to parse, and later the `__call__` method does the actual parsing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class list_of:\n",
    "    \"list_of(category) is a callable that parses a comma-separated list of <category>\"\n",
    "    def __init__(self, category): self.category = category\n",
    "    def __call__(self):\n",
    "        result = ([self.category()] if tokens else [])\n",
    "        while pop(','):\n",
    "            result.append(self.category())\n",
    "        return result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Parsing: Top Level `parse`, and Handling Errors\n",
    "\n",
    "Most of the parsing action happens inside the function `statement()`, but at the very top level, `parse(program)` takes a program text (that is, a string), and parses each line by calling `parse_line`, sorting the resulting list of lines by line number. If we didn't have to handle errors, this would be simple:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def parse(program): return sorted(parse_line(line) for line in lines(program))\n",
    "\n",
    "def parse_line(line): global tokens; tokens = tokenizer(line); return statement()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To handle syntactic errors, I add to `parse_line` a `try/except` that catches exceptions raised by calls to `fail`. I acknowledge the interpreter isn't very thorough about handling all errors, and the error messages could be more helpful."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def parse_line(line):\n",
    "    \"Return a Stmt(linenumber, statement_type, arguments).\"\n",
    "    global tokens\n",
    "    tokens = tokenizer(line)\n",
    "    try:\n",
    "        stmt = statement()\n",
    "        if tokens: fail('extra tokens at end of line')\n",
    "        return stmt\n",
    "    except SyntaxError as err:\n",
    "        print(\"Error in line '{}' at '{}': {}\".format(line, ' '.join(tokens), err))\n",
    "        return Stmt(0, 'REM', []) # Return dummy statement\n",
    "        \n",
    "def fail(message): raise SyntaxError(message)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Parsing: Building A Representation of the Program\n",
    "\n",
    "A program is represented by various data structures: a list of statements, where each statement contains various components: subscripted variable references, user-defined functions, function calls, operation calls, variable names, numbers, and labels. Here I define these data structures with `namedtuple`s:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from collections import namedtuple, defaultdict, deque\n",
    "\n",
    "Stmt      = namedtuple('Stmt',      'num, typ, args')     # '1 GOTO 9' => Stmt(1, 'GOTO', 9)\n",
    "Subscript = namedtuple('Subscript', 'var, indexes')       # 'A(I)'     => Subscript('A', ['I'])\n",
    "Funcall   = namedtuple('Funcall',   'f, x')               # 'SQR(X)'   => Funcall('SQR', 'X')\n",
    "Opcall    = namedtuple('Opcall',    'x, op, y')           # 'X + 1'    => Opcall('X', '+', 1)\n",
    "ForState  = namedtuple('ForState',  'continu, end, step') # Data for FOR loop \n",
    "\n",
    "class Function(namedtuple('_', 'parm, body')):\n",
    "    \"User-defined function; 'DEF FNC(X) = X ^ 3' => Function('X', Opcall('X', '^', 3))\"\n",
    "    def __call__(self, value):                           \n",
    "        variables[self.parm] = value # Global assignment to the parameter\n",
    "        return evalu(self.body)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first four namedtuples should be self-explanatory. The next one, `ForState`, is used to represent the state of a `FOR` loop variable while the program is running, but does not appear in the static representation of the program.\n",
    "`Function` is used to represent the definition of a user defined function. When the user writes `\"DEF FNC(X) = X ^ 3\"`, we create an object with `Function(parm='X', body=Opcall('X', '^', 3))`, and whenever the program calls, say, `FNC(2)` in an expression, the call returns 8, and also assigns 2 to the *global* variable `X` (whereas in modern languages, it would temporarily bind a new *local* variable named `X`). BASIC has no local variables. Note the technique of making `Function` be a subclass of a `namedtuple`; we are then free to add the `__call__` method to the subclass."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Parsing: Grammar of `PRINT` Statements\n",
    "\n",
    "On page 26 of the manual, it appears that the grammar rule for `PRINT` should be `[list_of(expression)]`. But in section 3.1, **More about PRINT**, some complications are introduced:\n",
    "\n",
    "* Labels (strings enclosed in double quotes) are allowed, as well as expressions.\n",
    "* The `\",\"` is not a separator. A line can end with `\",\"`.\n",
    "* Optionally, `\";\"` can be used instead of `\",\"`.\n",
    "* Optionally, the `\",\"` or `\";\"` can be omitted&mdash;we can have a label immediately followed by an expression.\n",
    "\n",
    "The effect of a comma is to advance the output to the next column that is a multiple of 15 (and to a new line if this goes past column 100). The effect of a semicolon is similar, but works in multiples of 3, not 15. (Note that column numbering starts at 0, not 1.) Normally, at the end of a `PRINT` statement we advance to a new line, but this is not done if the statement ends in `\",\"` or `\";\"`. Here are some examples:\n",
    "\n",
    "* `10 PRINT X, Y`\n",
    "<br>Prints the value of `X` in column 0 and `Y` in column 15. Advances to new line.\n",
    "\n",
    "* `20 PRINT \"X =\", X`\n",
    "<br>Prints the string `\"X =\"` in column 0 and the value of `X` in column 15. Advances to new line.\n",
    "\n",
    "* `30 PRINT \"X = \" X`\n",
    "<br>Prints the string `\"X =\"` in column 0 and the value of `X` immediately after. Advances to new line.\n",
    "\n",
    "* `40 PRINT X; Y,`\n",
    "<br>Prints the value of `X` in column 0, and the value of `Y` in the column that is the first available multiple of 3 after that.\n",
    "For example, if `X` is `'0'`, then print `Y` in column 3, but if `X` is `'12345'`, then we've gone past column 3, so print `Y` in column 6.\n",
    "Then, because the statement ends in a comma, advance to the next column that is a multiple of 15, but not to a new line.\n",
    "\n",
    "That explanation was long, but the implementation is short (at least for the parsing part; later we will see the execution part):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def labels_and_expressions():\n",
    "    \"Parse a sequence of label / comma / semicolon / expression (for PRINT statement).\"\n",
    "    result = []\n",
    "    while tokens:\n",
    "        item = pop(is_label) or pop(',') or pop(';') or expression()\n",
    "        result.append(item)\n",
    "    return result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Parsing: Grammar of Expressions\n",
    "\n",
    "Now for the single most complicated part of the grammar: the `expression`.  The biggest complication is operator precedence: the string `\"A + B * X + C\"` should be parsed as if it were `\"A + (B * X) + C\"`, and not as `\"(A + B) * (X + C),\"` because multiplication binds more tightly than addition. There are [many schemes](https://en.wikipedia.org/wiki/Operator-precedence_parser) for parsing expressions, I'll use  [an approach](https://groups.google.com/forum/#!topic/comp.compilers/ruJLlQTVJ8o) attributed to Keith Clarke.\n",
    "\n",
    "Like all our grammatical categories, calling `expression()` pops off some tokens and returns a data object. The first thing it does is parse one of five types of simple \"primary\" expressions: \n",
    "a number like `1.23`; \n",
    "a possibly-subscripted variable like `X` or `A(I)`;\n",
    "a function call like `SIN(X)`;\n",
    "a unary negation like `-X`;\n",
    "or a parenthesied expression like `(X + 1)`. \n",
    "\n",
    "Next, `expression` looks for infix operators. To parse `'X + 1'` as an expression, first `primary()` would parse `'X'`, then it would pop off the `'+'` operator, then a recursive call to `expression()` would parse `1`, and the results can then be combined into an `Opcall('X', '+', 1)`. If there are multiple infix operators, they can all be handled, as in `'X+1+Y+2'`, which gets parsed as `Opcall(Opcall(Opcall('X', '+', 1), '+', 'Y'), '+', 2)`.\n",
    "\n",
    "When there are multiple infix operators of *different* precedence, as in `\"A + B * X + C\"`, the trick is to know which operators are parsed by the top-level call to `expression`, and which by recursive calls. When we first call `expression()`, the optional parameter `prec` gets the default value, 1, which is the precedence of addition and subtraction. When `expression` makes a recursive call, it passes the precedence of the current operator, and we only parse off operator/expression pairs at an equal or higher precedence. So, in parsing `\"A + B * X + C\"`, when we pop off the `'*'` operator (which has precedence 2), we then recursively call `expression(2)`, which parses off an expression containing operators of precedence 2 or higher, which means the recursive call will parse `X`, but it won't pop off the `'+'`, because that is at a lower precedence. So we correctly get the structure `\"(A + ((B * X) + C)\"`.\n",
    "\n",
    "The function `associativity` ensures that the operator `'^'` is right associative, meaning `10^2^3` = `(10^(2^3))`, whereas the other operators are left associative, so `10-2-3` = `((10-2)-3)`.\n",
    "\n",
    "Here is the implementation of `expression`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def expression(prec=1): \n",
    "    \"Parse an expression: a primary and any [op expression]* pairs with precedence(op) >= prec.\"\n",
    "    exp = primary()                         # 'A' => 'A'\n",
    "    while precedence(peek()) >= prec:\n",
    "        op = pop()\n",
    "        rhs = expression(precedence(op) + associativity(op))\n",
    "        exp = Opcall(exp, op, rhs)          # 'A + B' => Opcall('A', '+', 'B')\n",
    "    return exp\n",
    "\n",
    "def primary():\n",
    "    \"Parse a primary expression (no infix op except maybe within parens).\"\n",
    "    if is_number(peek()):                   # '1.23' => 1.23 \n",
    "        return number()\n",
    "    elif is_varname(peek()):                # X or A(I) or M(I+1, J)\n",
    "        return variable()\n",
    "    elif is_funcname(peek()):               # SIN(X) => Funcall('SIN', 'X')\n",
    "        return Funcall(pop(), primary())\n",
    "    elif pop('-'):                          # '-X' => Funcall('NEG', 'X')\n",
    "        return Funcall('NEG', primary())\n",
    "    elif pop('('):                          # '(X)' => 'X'\n",
    "        exp = expression()\n",
    "        pop(')') or fail('expected \")\" to end expression')\n",
    "        return exp\n",
    "    else:\n",
    "        return fail('unknown expression')\n",
    "\n",
    "def precedence(op): \n",
    "    return (3 if op == '^' else 2 if op in ('*', '/', '%') else 1 if op in ('+', '-') else 0)\n",
    "\n",
    "def associativity(op): \n",
    "    return (0 if op == '^' else 1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Parsing: The Complete Parser in Action\n",
    "\n",
    "I've now written all the grammatical categories, so I can now safely instantiate the global variable `grammar` by calling `Grammar()`, and parse a program:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Stmt(num=10, typ='REM', args=[None]),\n",
       " Stmt(num=11, typ='DATA', args=[[8.0, 4.0]]),\n",
       " Stmt(num=15, typ='READ', args=[['N0', 'P0']]),\n",
       " Stmt(num=20, typ='PRINT', args=[['\"N\"', ',']]),\n",
       " Stmt(num=25, typ='FOR', args=['P', 2.0, 'P0', 1]),\n",
       " Stmt(num=30, typ='PRINT', args=[['\"N ^\"', 'P', ',']]),\n",
       " Stmt(num=35, typ='NEXT', args=['P']),\n",
       " Stmt(num=40, typ='PRINT', args=[['\"SUM\"']]),\n",
       " Stmt(num=45, typ='LET', args=['S', 0.0]),\n",
       " Stmt(num=50, typ='FOR', args=['N', 2.0, 'N0', 1]),\n",
       " Stmt(num=55, typ='PRINT', args=[['N', ',']]),\n",
       " Stmt(num=60, typ='FOR', args=['P', 2.0, 'P0', 1]),\n",
       " Stmt(num=65, typ='LET', args=['S', Opcall(x='S', op='+', y=Opcall(x='N', op='^', y='P'))]),\n",
       " Stmt(num=70, typ='PRINT', args=[[Opcall(x='N', op='^', y='P'), ',']]),\n",
       " Stmt(num=75, typ='NEXT', args=['P']),\n",
       " Stmt(num=80, typ='PRINT', args=[['S']]),\n",
       " Stmt(num=85, typ='NEXT', args=['N']),\n",
       " Stmt(num=99, typ='END', args=[])]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "grammar = Grammar()\n",
    "\n",
    "parse(program)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here are some more tests:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'ok'"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def test_exp(text, repr):\n",
    "    \"Test that text can be parsed as an expression to yield repr, with no tokens left over.\"\n",
    "    global tokens\n",
    "    tokens = tokenizer(text)\n",
    "    return (expression() == repr) and not tokens\n",
    "    \n",
    "def test_parser():\n",
    "    assert is_funcname('SIN') and is_funcname('FNZ') # Function names are three letters\n",
    "    assert not is_funcname('X') and not is_funcname('')\n",
    "    assert is_varname('X') and is_varname('A2') # Variables names are one letter and an optional digit\n",
    "    assert not is_varname('FNZ') and not is_varname('A10') and not is_varname('')\n",
    "    assert is_relational('>') and is_relational('>=') and not is_relational('+')\n",
    "    \n",
    "    assert test_exp('A + B * X + C', Opcall(Opcall('A', '+', Opcall('B', '*', 'X')), '+', 'C'))\n",
    "    assert test_exp('A + B + X + C', Opcall(Opcall(Opcall('A', '+', 'B'), '+', 'X'), '+', 'C'))\n",
    "    assert test_exp('SIN(X)^2',      Opcall(Funcall('SIN', 'X'), '^', 2))\n",
    "    assert test_exp('10 ^ 2 ^ 3',    Opcall(10, '^', Opcall(2, '^', 3))) # right associative\n",
    "    assert test_exp('10 - 2 - 3',    Opcall(Opcall(10, '-', 2), '-', 3)) # left associative\n",
    "    assert test_exp('A(I)+M(I, J)',  Opcall(Subscript(var='A', indexes=['I']), '+', \n",
    "                                            Subscript(var='M', indexes=['I', 'J'])))\n",
    "    assert test_exp('X * -1',        Opcall('X', '*', Funcall('NEG', 1.0)))\n",
    "    assert test_exp('X--Y--Z',       Opcall(Opcall('X', '-', Funcall('NEG', 'Y')), \n",
    "                                            '-', Funcall('NEG', 'Z')))\n",
    "    assert test_exp('((((X))))',     'X')\n",
    "    return 'ok'\n",
    "\n",
    "test_parser()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Execution\n",
    "\n",
    "Now that we can parse programs, we're ready to execute them. I'll first define `run` to  `parse` and `execute` a program:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def run(program): execute(parse(program))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The function `execute(stmts)` first calls `preprocess(stmts)` to handle *declarations*: `DATA` and `DEF` statements that are processed one time only, before the program runs, regardless of their line numbers. (`DIM` statements are also declarations, but I decided that all lists/tables can have any number of elements, so I can ignore `DIM` declarations.)\n",
    "`execute` keeps track of the state of the program, partially in three globals:\n",
    "\n",
    "* `variables`: A mapping of the values of all BASIC variables (both subscripted and unsubscripted). <br>For example, `{'P1': 3.14, ('M', (1, 1)): 42.0}` says that the value of `P1` is `3.14` and `M(1, 1)` is `42.0`.\n",
    "* `functions`: A mapping of the values of all BASIC functions (both built-in and user-defined). <br>For example, `{'FNC': Function('X', Opcall('X', '^', 3)), 'SIN': math.sin}` \n",
    "* `column`: The column that `PRINT` will print in next.\n",
    "\n",
    "And also with these local variables:\n",
    "\n",
    "* `data`: a queue of all the numbers in `DATA` statements.\n",
    "* `pc`: program counter; the index into the list of statements.\n",
    "* `ret`: the index where a `RETURN` statement will return to.\n",
    "* `fors`: a map of `{varname: ForState(...)}` which gives the state of each `FOR` loop variable.\n",
    "* `goto`: a mapping of `{linenumber: index}`, for example `{10: 0, 20: 1}` for a program with two line numbers, 10 and 20.\n",
    "\n",
    "\n",
    "Running the program means executing the statement that the program counter (`pc`) is currently pointing at, repeatedly, until we hit an `END` or `STOP` statement (or a `READ` statement when there is no more data). \n",
    "The variable `pc` is initialized to `0` (the index of the first statement in the program) and is then incremented by `1` each cycle to go to the following statement; but a branching statement (`GOTO`, `IF`, `GOSUB`, or `RETURN`) can change the `pc` to something other than the following statement. Note that branching statements refer to line numbers, but the `pc` refers to the *index* number within the list of statements. The variable `goto` maps from line numbers to index numbers. In BASIC there is no notion of a *stack*, neither for variables nor return addresses. If I do a `GOSUB` to a subroutine that itself does a `GOSUB`, then the original return address is lost, because BASIC has only one return address register (which we call `ret`).\n",
    "\n",
    "The main body of `execute` checks the statement type, and takes appropriate action. All the statement types are straightforward, except for `FOR` and `NEXT`, which are explained a bit later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def execute(stmts): \n",
    "    \"Parse and execute the BASIC program.\"\n",
    "    global variables, functions, column\n",
    "    functions, data = preprocess(stmts) # {name: function,...}, deque[number,...]\n",
    "    variables = defaultdict(float) # mapping of {variable: value}, default 0.0\n",
    "    column    = 0                  # column to PRINT in next\n",
    "    pc        = 0                  # program counter\n",
    "    ret       = 0                  # index (pc) that a GOSUB returns to\n",
    "    fors      = {}                 # runtime map of {varname: ForState(...)}\n",
    "    goto      = {stmt.num: i       # map of {linenumber: index}\n",
    "                 for (i, stmt) in enumerate(stmts)}\n",
    "    while pc < len(stmts):\n",
    "        (_, typ, args) = stmts[pc] # Fetch and decode the instruction\n",
    "        pc += 1                    # Increment the program counter\n",
    "        if typ in ('END', 'STOP') or (typ == 'READ' and not data): \n",
    "            return\n",
    "        elif typ == 'LET':\n",
    "            V, exp = args\n",
    "            let(V, evalu(exp))\n",
    "        elif typ == 'READ':\n",
    "            for V in args[0]:\n",
    "                let(V, data.popleft())\n",
    "        elif typ == 'PRINT':\n",
    "            basic_print(args[0])\n",
    "        elif typ == 'GOTO':\n",
    "            pc = goto[args[0]]\n",
    "        elif typ == 'IF':\n",
    "            lhs, relational, rhs, dest = args\n",
    "            if functions[relational](evalu(lhs), evalu(rhs)):\n",
    "                pc = goto[dest]\n",
    "        elif typ == 'FOR':\n",
    "            V, start, end, step = args\n",
    "            variables[V] = evalu(start)\n",
    "            fors[V] = ForState(pc, evalu(end), evalu(step))\n",
    "        elif typ == 'NEXT':\n",
    "            V = args[0]\n",
    "            continu, end, step = fors[V]\n",
    "            if ((step >= 0 and variables[V] + step <= end) or\n",
    "                (step <  0 and variables[V] + step >= end)):\n",
    "                variables[V] += step\n",
    "                pc = continu\n",
    "        elif typ == 'GOSUB':\n",
    "            ret = pc\n",
    "            pc  = goto[args[0]]\n",
    "        elif typ == 'RETURN':\n",
    "            pc = ret"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here are the functions referenced by `execute`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import math\n",
    "import random\n",
    "import operator as op\n",
    "\n",
    "def preprocess(stmts):\n",
    "    \"\"\"Go through stmts and return two values extracted from the declarations: \n",
    "    functions: a mapping of {name: function}, for both built-in and user-defined functions,\n",
    "    data:      a queue of all the numbers in DATA statements.\"\"\"\n",
    "    functions = {  # A mapping of {name: function}; first the built-ins:\n",
    "        'SIN': math.sin, 'COS': math.cos, 'TAN': math.tan, 'ATN': math.atan, \n",
    "        'ABS': abs, 'EXP': math.exp, 'LOG': math.log, 'SQR': math.sqrt, 'INT': int,\n",
    "        '>': op.gt, '<': op.lt, '=': op.eq, '>=': op.ge, '<=': op.le, '<>': op.ne, \n",
    "        '^': pow, '+': op.add, '-': op.sub, '*': op.mul, '/': op.truediv, '%': op.mod,\n",
    "        'RND': lambda _: random.random(), 'NEG': op.neg}\n",
    "    data = deque() # A queue of numbers that READ can read from\n",
    "    for (_, typ, args) in stmts:\n",
    "        if typ == 'DEF':\n",
    "            name, parm, body = args\n",
    "            functions[name] = Function(parm, body)\n",
    "        elif typ == 'DATA':\n",
    "            data.extend(args[0])\n",
    "    return functions, data\n",
    "\n",
    "def evalu(exp):\n",
    "    \"Evaluate an expression, returning a number.\"\n",
    "    if isinstance(exp, Opcall):\n",
    "        return functions[exp.op](evalu(exp.x), evalu(exp.y))\n",
    "    elif isinstance(exp, Funcall):\n",
    "        return functions[exp.f](evalu(exp.x))\n",
    "    elif isinstance(exp, Subscript):\n",
    "        return variables[exp.var, tuple(evalu(x) for x in  exp.indexes)]\n",
    "    elif is_varname(exp):\n",
    "        return variables[exp]\n",
    "    else: # number constant\n",
    "        return exp\n",
    "    \n",
    "def let(V, value):\n",
    "    \"Assign value to the variable name or Subscripted variable.\"\n",
    "    if isinstance(V, Subscript): # A subsscripted variable\n",
    "        variables[V.var, tuple(evalu(x) for x in V.indexes)] = value \n",
    "    else:                        # An unsubscripted variable\n",
    "        variables[V] = value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Execution: `FOR/NEXT` Statements\n",
    "\n",
    "I have to admit I don't completely understand `FOR` loops. My questions include:\n",
    "\n",
    "* Are the `END` and `STEP` expressions evaluated once when we first enter the `FOR` loop, or each time through the loop?\n",
    "* After executing `\"FOR V = 1 TO 10\"`, is the value of `V` equal to 10 or 11? (Answer: the manual says 10.)\n",
    "* Does `\"FOR V = 0 TO -2\"` execute zero times? Or do all loops execute at least once, with the termination test done by the `NEXT`?\n",
    "* What if control branches into the middle of a loop and hits the `NEXT` statement, without ever executing the corresponding `FOR` statement? \n",
    "* What if control branches into the middle of a loop and hits the `NEXT` statement, without ever executing the corresponding `FOR` statement, but we have previously\n",
    "executed a `FOR` statement of a *different* loop that happens to use the same variable name?\n",
    "\n",
    "I chose a solution that is easy to implement, and correctly runs all the examples in the manual, but I'm not certain that my solution is true to the original intent. Consider this program:\n",
    "\n",
    "    10 PRINT \"TABLE OF SQUARES\"\n",
    "    20 LET N = 10\n",
    "    30 FOR V = 1 to N STEP N/5\n",
    "    40   PRINT V, V * V\n",
    "    50 NEXT V\n",
    "    60 END\n",
    "  \n",
    "    \n",
    "* When control hits the `\"FOR V\"` statement in line 30, I assign:\n",
    "<br>`variables['V'] = 1`\n",
    "<br>`    fors['V'] = ForState(continu=3, end=10, step=2)`\n",
    "<br>where `3` is the index of line 40 (the line right after the `FOR` statement); `10` is the value of `N`; and `2` is the value of `N/5`.\n",
    "* When control hits the `\"NEXT V\"` statement in line 50, I do the following:\n",
    "<br>Examine `fors['V']` to check if `V` incremented by the step value, `2`, is within the bounds defined by the end, `10`. \n",
    "<br>If it is, increment `V` and assign `pc` to be `3`, the `continu` value. \n",
    "<br>If not, continue on to the following statement, `60`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Execution: Printing\n",
    "\n",
    "We showed how to parse a `PRINT` statement with `labels_and_expressions()`; now it is time to execute a `PRINT` statement, printing each of the labels and expressions, and keeping track of what column to print at next, using the global variable `column`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def basic_print(items): \n",
    "    \"Print the items (',' / ';' / label / expression) in appropriate columns.\"\n",
    "    for item in items:\n",
    "        if item == ',':      pad(15)\n",
    "        elif item == ';':    pad(3)\n",
    "        elif is_label(item): print_string(item.replace('\"', ''))\n",
    "        else:                print_string(\"{:g} \".format(evalu(item)))\n",
    "    if (not items) or items[-1] not in (',', ';'):\n",
    "        newline()\n",
    "        \n",
    "def print_string(s): \n",
    "    \"Print a string, keeping track of column, and advancing to newline if at or beyond column 100.\"\n",
    "    global column\n",
    "    print(s, end='')\n",
    "    column += len(s)\n",
    "    if column >= 100: newline()\n",
    "        \n",
    "def pad(width): \n",
    "    \"Pad out to the column that is the next multiple of width.\"\n",
    "    while column % width != 0: \n",
    "        print_string(' ')\n",
    "\n",
    "def newline(): global column; print(); column = 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Running the First Sample Program\n",
    "\n",
    "Let's re-examine, parse, and run our first sample program:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "10 REM POWER TABLE\n",
      "11 DATA 8, 4\n",
      "15 READ N0, P0\n",
      "20 PRINT \"N\",\n",
      "25 FOR P = 2 to P0\n",
      "30   PRINT \"N ^\" P,\n",
      "35 NEXT P\n",
      "40 PRINT \"SUM\"\n",
      "45 LET S = 0\n",
      "50 FOR N = 2 TO N0\n",
      "55   PRINT N,\n",
      "60   FOR P = 2 TO P0\n",
      "65     LET S = S + N ^ P\n",
      "70     PRINT N ^ P,\n",
      "75   NEXT P\n",
      "80   PRINT S\n",
      "85 NEXT N\n",
      "99 END\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(program)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Stmt(num=10, typ='REM', args=[None]),\n",
       " Stmt(num=11, typ='DATA', args=[[8.0, 4.0]]),\n",
       " Stmt(num=15, typ='READ', args=[['N0', 'P0']]),\n",
       " Stmt(num=20, typ='PRINT', args=[['\"N\"', ',']]),\n",
       " Stmt(num=25, typ='FOR', args=['P', 2.0, 'P0', 1]),\n",
       " Stmt(num=30, typ='PRINT', args=[['\"N ^\"', 'P', ',']]),\n",
       " Stmt(num=35, typ='NEXT', args=['P']),\n",
       " Stmt(num=40, typ='PRINT', args=[['\"SUM\"']]),\n",
       " Stmt(num=45, typ='LET', args=['S', 0.0]),\n",
       " Stmt(num=50, typ='FOR', args=['N', 2.0, 'N0', 1]),\n",
       " Stmt(num=55, typ='PRINT', args=[['N', ',']]),\n",
       " Stmt(num=60, typ='FOR', args=['P', 2.0, 'P0', 1]),\n",
       " Stmt(num=65, typ='LET', args=['S', Opcall(x='S', op='+', y=Opcall(x='N', op='^', y='P'))]),\n",
       " Stmt(num=70, typ='PRINT', args=[[Opcall(x='N', op='^', y='P'), ',']]),\n",
       " Stmt(num=75, typ='NEXT', args=['P']),\n",
       " Stmt(num=80, typ='PRINT', args=[['S']]),\n",
       " Stmt(num=85, typ='NEXT', args=['N']),\n",
       " Stmt(num=99, typ='END', args=[])]"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "parse(program)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "N              N ^2           N ^3           N ^4           SUM\n",
      "2              4              8              16             28 \n",
      "3              9              27             81             145 \n",
      "4              16             64             256            481 \n",
      "5              25             125            625            1256 \n",
      "6              36             216            1296           2804 \n",
      "7              49             343            2401           5597 \n",
      "8              64             512            4096           10269 \n"
     ]
    }
   ],
   "source": [
    "run(program)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "# Running More Programs\n",
    "\n",
    "Rather than put together a suite of unit tests for `execute`, I'll run integration tests&mdash;additional whole programs. I've also added a few assertions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4              -5.5 \n",
      "0.666667       0.166667 \n",
      "-3.66667       3.83333 \n"
     ]
    }
   ],
   "source": [
    "# Linear equation solver (page 3 and 19 of the manual)\n",
    "\n",
    "run('''\n",
    "10 READ A1, A2, A3, A4\n",
    "15 LET D = A1 * A4 - A3 * A2\n",
    "20 IF D = 0 THEN 65\n",
    "30 READ B1, B2\n",
    "37   LET X1 = (B1*A4 - B2 * A2) / D\n",
    "42   LET X2 = ( A1 * B2 - A3 * B1)/D\n",
    "55   PRINT X1, X2\n",
    "60 GOTO 30\n",
    "65 PRINT \"NO UNIQUE SOLUTION\"\n",
    "70 DATA 1, 2, 4\n",
    "80 DATA 2, -7, 5\n",
    "85 DATA 1, 3, 4, -7\n",
    "90 END\n",
    "''')\n",
    "\n",
    "assert variables['D'] != 0\n",
    "assert variables['X1'] == -11/3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "X VALUE        SINE           RESOLUTION\n",
      "1.6            0.999574       0.1 \n",
      "1.57           1              0.01 \n",
      "1.571          1              0.001 \n",
      "1.5708         1              0.0001 \n"
     ]
    }
   ],
   "source": [
    "# Find max(sin(x)) for 0 <= x <= 3 (page 25)\n",
    "\n",
    "run('''\n",
    "5 PRINT \"X VALUE\", \"SINE\", \"RESOLUTION\"\n",
    "10 READ D\n",
    "20   LET M = -1\n",
    "30   FOR X = 0 TO 3 STEP D\n",
    "40   IF SIN(X) <= M THEN 80\n",
    "50     LET X0 = X\n",
    "60     LET M = SIN(X)\n",
    "80   NEXT X\n",
    "85   PRINT X0, M, D\n",
    "90 GO TO 10\n",
    "95 DATA .1, .01, .001, .0001\n",
    "99 END\n",
    "''')\n",
    "\n",
    "assert abs(variables['X0'] - math.pi / 2) < 0.00001"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1              2              3              4              5              6              7         \n",
      "8              9              10             11             12             "
     ]
    }
   ],
   "source": [
    "# Printing (page 32)\n",
    "\n",
    "run('''\n",
    "10 FOR I = 1 TO 12\n",
    "20   PRINT I,\n",
    "30 NEXT I\n",
    "40 END''')\n",
    "\n",
    "assert variables['I'] == 12"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "THIS PROGRAM COMPUTES AND PRINTS THE NTH POWERS\n",
      "OF THE NUMBERS LESS THAN OR EQUAL TO N FOR VARIOUS\n",
      "N FROM 1 TO 7\n",
      "\n",
      "N =   1  I^N:\n",
      "1              \n",
      "\n",
      "N =   2  I^N:\n",
      "1              4              \n",
      "\n",
      "N =   3  I^N:\n",
      "1              8              27             \n",
      "\n",
      "N =   4  I^N:\n",
      "1              16             81             256            \n",
      "\n",
      "N =   5  I^N:\n",
      "1              32             243            1024           3125           \n",
      "\n",
      "N =   6  I^N:\n",
      "1              64             729            4096           15625          46656          \n",
      "\n",
      "N =   7  I^N:\n",
      "1              128            2187           16384          78125          279936         823543    \n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Powers (page 33)\n",
    "\n",
    "run('''\n",
    " 5 PRINT \"THIS PROGRAM COMPUTES AND PRINTS THE NTH POWERS\"\n",
    " 6 PRINT \"OF THE NUMBERS LESS THAN OR EQUAL TO N FOR VARIOUS\"\n",
    " 7 PRINT \"N FROM 1 TO 7\"\n",
    " 8 PRINT\n",
    "10 FOR N = 1 TO 7\n",
    "15   PRINT \"N = \"; N; \"I^N:\"\n",
    "20   FOR I = 1 TO N\n",
    "30     PRINT I^N,\n",
    "40   NEXT I\n",
    "50   PRINT\n",
    "60   PRINT\n",
    "70 NEXT N\n",
    "80 END''')\n",
    "\n",
    "assert variables['N'] ** variables['I'] == 7 ** 7"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1e+06 941192   884736   830584   778688   729000   681472   636056   592704   551368   512000   474552 \n",
      "438976   405224   373248   343000   314432   287496   262144   238328   216000   195112   175616   157464 \n",
      "140608   125000   110592   97336 85184 74088 64000 54872 46656 39304 32768 27000 21952 17576 13824 10648 \n",
      "8000  5832  4096  2744  1728  1000  512   216   64 8  0  "
     ]
    }
   ],
   "source": [
    "# Cubes (page 35; but with STEP -2 because I haven't tested negative step yet)\n",
    "\n",
    "run('''\n",
    "10 FOR I = 100 TO 0 STEP -2\n",
    "20   PRINT I*I*I;\n",
    "30 NEXT I\n",
    "40 END\n",
    "''')\n",
    "\n",
    "assert variables['I'] == 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "TOTAL SALES FOR SALESMAN1     $180.5 \n",
      "TOTAL SALES FOR SALESMAN2     $211.3 \n",
      "TOTAL SALES FOR SALESMAN3     $131.65 \n",
      "TOTAL SALES FOR SALESMAN4     $166.55 \n",
      "TOTAL SALES FOR SALESMAN5     $169.4 \n"
     ]
    }
   ],
   "source": [
    "# Sales ledger (page 37; cleaned up a bit)\n",
    "\n",
    "run('''\n",
    "10  FOR I = 1 TO 3\n",
    "20    READ P(I)\n",
    "30  NEXT I\n",
    "40  FOR I = 1 TO 3\n",
    "50    FOR J = 1 TO 5\n",
    "60      READ S(I, J)\n",
    "70    NEXT J\n",
    "80  NEXT I\n",
    "90  FOR J = 1 TO 5\n",
    "100   LET S = 0\n",
    "110   FOR I = 1 TO 3\n",
    "120     LET S = S + P(I) * S(I, J)\n",
    "130   NEXT I\n",
    "140   PRINT \"TOTAL SALES FOR SALESMAN\"J, \"$\"S\n",
    "150 NEXT J\n",
    "190 DIM S(3, 5)\n",
    "200 DATA 1.25, 4.30, 2.50\n",
    "210 DATA 40, 20, 37, 29, 42\n",
    "220 DATA 10, 16, 3, 21, 8\n",
    "230 DATA 35, 47, 29, 16, 33\n",
    "300 END\n",
    "''')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can look at the variables that have been stored for this program:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "defaultdict(float,\n",
       "            {'J': 5.0,\n",
       "             ('S', (3.0, 1.0)): 35.0,\n",
       "             ('S', (3.0, 4.0)): 16.0,\n",
       "             ('S', (3.0, 5.0)): 33.0,\n",
       "             ('S', (1.0, 2.0)): 20.0,\n",
       "             ('S', (1.0, 3.0)): 37.0,\n",
       "             ('S', (2.0, 3.0)): 3.0,\n",
       "             ('S', (2.0, 2.0)): 16.0,\n",
       "             ('S', (1.0, 5.0)): 42.0,\n",
       "             ('P', (1.0,)): 1.25,\n",
       "             ('S', (3.0, 3.0)): 29.0,\n",
       "             ('S', (2.0, 4.0)): 21.0,\n",
       "             'S': 169.4,\n",
       "             'I': 3.0,\n",
       "             ('P', (2.0,)): 4.3,\n",
       "             ('S', (3.0, 2.0)): 47.0,\n",
       "             ('S', (1.0, 1.0)): 40.0,\n",
       "             ('S', (1.0, 4.0)): 29.0,\n",
       "             ('S', (2.0, 5.0)): 8.0,\n",
       "             ('P', (3.0,)): 2.5,\n",
       "             ('S', (2.0, 1.0)): 10.0})"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1  2  4  9  5  0  5  3  7  7  3  8  6  4  4  6  7  4  5  4  8  8  7  9  4  1  0  3  5  2  3  4  5  3 \n",
      "6  5  3  1  0  9  5  6  1  4  5  7  3  1  4  3  6  3  7  2  3  0  2  2  7  5  0  8  7  9  3  9  5  7 \n",
      "5  0  1  9  6  3  7  5  0  0  5  7  3  5  9  3  2  6  1  2  1  9  1  7  0  9  0  6  9  6  7  2  "
     ]
    }
   ],
   "source": [
    "# Random number generator (page 40)\n",
    "\n",
    "run('''\n",
    "10 FOR I = 1 TO 100\n",
    "20   PRINT INT(10 * RND(X));\n",
    "30 NEXT I\n",
    "40 END\n",
    "''')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "D  SIN(D)      COS(D)         SIN(D)^2 + COS(D)^2\n",
      "0  0           1              1 \n",
      "15 0.258819    0.965926       1 \n",
      "30 0.5         0.866025       1 \n",
      "45 0.707107    0.707107       1 \n",
      "60 0.866025    0.5            1 \n",
      "75 0.965926    0.258819       1 \n",
      "90 1           6.12323e-17    1 \n"
     ]
    }
   ],
   "source": [
    "# DEF example: table of SIN(X) and COS(X) in degrees (page 41, expanded some)\n",
    "\n",
    "run('''\n",
    " 5 PRINT \"D\"; \"SIN(D)\", \"COS(D)\", \"SIN(D)^2 + COS(D)^2\"\n",
    "20 LET P = 3.1415926535897932 / 180\n",
    "30 FOR X = 0 TO 90 STEP 15\n",
    "40   PRINT X; FNS(X), FNC(X), FNS(X)^2 + FNC(X)^2\n",
    "50 NEXT X\n",
    "97 DEF FNS(D) = SIN(D * P)\n",
    "98 DEF FNC(D) = COS(D * P)\n",
    "99 END\n",
    "''')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "9              27             120 \n",
      "25             125            780 \n",
      "Z = 2615 \n"
     ]
    }
   ],
   "source": [
    "# GOSUB (page 43)\n",
    "\n",
    "run('''\n",
    "100 LET X = 3\n",
    "110 GOSUB 400\n",
    "120 PRINT U, V, W\n",
    "200 LET X = 5\n",
    "210 GOSUB 400\n",
    "215 PRINT U, V, W\n",
    "220 LET Z = U + 2*V + 3*W\n",
    "230 PRINT \"Z = \" Z\n",
    "240 STOP\n",
    "400 LET U = X*X\n",
    "410 LET V = X*X*X\n",
    "420 LET W = X*X*X*X + X*X*X + X*X + X\n",
    "430 RETURN\n",
    "440 END\n",
    "''')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "21 \n"
     ]
    }
   ],
   "source": [
    "# Sum of non-negative multiples of 0.1 less than or equal to 2, two ways (page 47)\n",
    "\n",
    "run('''\n",
    " 5 LET S = 0\n",
    "10 LET N = 0\n",
    "20 LET S = S + N/10\n",
    "30   IF N >= 20 THEN 60\n",
    "40   LET N = N + 1\n",
    "50 GOTO 20\n",
    "60 PRINT S\n",
    "70 END\n",
    "''')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "21 \n"
     ]
    }
   ],
   "source": [
    "run('''\n",
    "20 FOR N = 1 TO 20\n",
    "40   LET S = S + N/10\n",
    "50 NEXT N\n",
    "60 PRINT S\n",
    "70 END\n",
    "''')\n",
    "\n",
    "assert variables['S'] == sum(N/10 for N in range(1, 21))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Checking for Syntax Errors\n",
    "\n",
    "Here we show a collection of syntax errors, and the messages they generate:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Error in line '1 X = 1' at 'X = 1': unknown statement type\n",
      "Error in line '2 GO TO JAIL' at 'J A I L': missing line number\n",
      "Error in line '3 FOR I = 1 ' at '': expected 'TO'\n",
      "Error in line '4 IF X > 0 & X < 10 GOTO 999' at '& X < 10 GOTO 999': expected 'THEN'\n",
      "Error in line '5 LET Z = (Z + 1' at '': expected \")\" to end expression\n",
      "Error in line '6 PRINT \"OH CANADA\", EH?' at '?': unknown expression\n",
      "Error in line '7 LET Z = +3' at '+ 3': unknown expression\n",
      "Error in line '8 LET X = Y ** 2' at '* 2': unknown expression\n",
      "Error in line '9 LET A(I = 1' at '= 1': expected \")\" to close subscript\n",
      "Error in line '10 IF A = 0 THEN 900 + 99' at '+ 99': extra tokens at end of line\n",
      "Error in line '11 NEXT A(I)' at '( I )': extra tokens at end of line\n",
      "Error in line '12 DEF F(X) = X ^ 2 + 1' at 'F ( X ) = X ^ 2 + 1': expected a function name\n",
      "Error in line '13 IF X != 0 THEN 999' at '! = 0 THEN 999': expected a relational operator\n",
      "Error in line '14 DEF FNS(X + 2*P1) = SIN(X)' at '+ 2 * P1 ) = SIN ( X )': expected ')'\n",
      "Error in line '15 DEF FNY(M, B) = M * X + B' at ', B ) = M * X + B': expected ')'\n",
      "Error in line '16 LET 3 = X' at '3 = X': expected a variable name\n",
      "Error in line '17 LET SIN = 7 * DEADLY' at 'SIN = 7 * D E A D L Y': expected a variable name\n",
      "Error in line '18 LET X = A-1(I)' at '( I )': extra tokens at end of line\n",
      "Error in line '19 FOR SCORE + 7' at 'C O R E + 7': expected '='\n",
      "Error in line '20 STOP IN NAME(LOVE)' at 'I N N A M E ( L O V E )': extra tokens at end of line\n",
      "Error in line '85 ENDURANCE.' at 'U R A N C E .': extra tokens at end of line\n",
      "ADD 2 + 2 = 4 \n"
     ]
    }
   ],
   "source": [
    "run('''\n",
    "1 X = 1\n",
    "2 GO TO JAIL\n",
    "3 FOR I = 1 \n",
    "4 IF X > 0 & X < 10 GOTO 999\n",
    "5 LET Z = (Z + 1\n",
    "6 PRINT \"OH CANADA\", EH?\n",
    "7 LET Z = +3\n",
    "8 LET X = Y ** 2\n",
    "9 LET A(I = 1\n",
    "10 IF A = 0 THEN 900 + 99\n",
    "11 NEXT A(I)\n",
    "12 DEF F(X) = X ^ 2 + 1\n",
    "13 IF X != 0 THEN 999\n",
    "14 DEF FNS(X + 2*P1) = SIN(X)\n",
    "15 DEF FNY(M, B) = M * X + B\n",
    "16 LET 3 = X\n",
    "17 LET SIN = 7 * DEADLY\n",
    "18 LET X = A-1(I)\n",
    "19 FOR SCORE + 7\n",
    "20 STOP IN NAME(LOVE)\n",
    "80 REMARKABLY, THE INTERPRETER\n",
    "81 REMEDIES THE ERRORS, AND THE PROPGRAM\n",
    "82 REMAINS AN EXECUTABLE ENTITY, UN-\n",
    "83 REMITTENTLY RUNNING, WITH NO\n",
    "84 REMORSE OR REGRETS, AND WITH GREAT\n",
    "85 ENDURANCE.\n",
    "98 PRINT \"ADD 2 + 2 = \" 2 + 2\n",
    "99 END\n",
    "''')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Longer Program: Life\n",
    "\n",
    "Now for a final, slightly more complicated example: Conway's Game of [Life](https://en.wikipedia.org/wiki/Conway's_Game_of_Life). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "GEN 0 \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  O  .  .  .  .  .  .  .  \n",
      ".  .  O  .  .  O  O  .  .  .  \n",
      ".  .  O  .  .  O  O  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      "GEN 1 \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  O  O  O  .  O  O  .  .  .  \n",
      ".  .  .  .  .  O  O  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      "GEN 2 \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  O  .  .  .  .  .  .  .  \n",
      ".  .  O  .  O  O  O  .  .  .  \n",
      ".  .  O  .  O  O  O  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      "GEN 3 \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  O  .  O  .  .  .  .  \n",
      ".  O  O  .  O  .  O  .  .  .  \n",
      ".  .  .  .  O  .  O  .  .  .  \n",
      ".  .  .  .  .  O  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      "GEN 4 \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  O  O  O  O  .  .  .  .  \n",
      ".  .  O  .  O  .  O  .  .  .  \n",
      ".  .  .  O  O  .  O  .  .  .  \n",
      ".  .  .  .  .  O  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      "GEN 5 \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  O  O  .  .  .  .  .  \n",
      ".  .  O  .  O  O  .  .  .  .  \n",
      ".  .  O  .  .  .  O  .  .  .  \n",
      ".  .  .  O  O  .  O  .  .  .  \n",
      ".  .  .  .  O  O  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      "GEN 6 \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  \n",
      ".  .  O  .  O  O  .  .  .  .  \n",
      ".  .  O  .  .  .  O  .  .  .  \n",
      ".  .  .  O  O  .  O  .  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      "GEN 7 \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  O  .  .  .  .  .  \n",
      ".  .  .  O  .  O  .  .  .  .  \n",
      ".  .  O  .  .  .  O  .  .  .  \n",
      ".  .  O  .  .  .  O  .  .  .  \n",
      ".  .  O  .  .  .  O  .  .  .  \n",
      ".  .  .  O  .  O  .  .  .  .  \n",
      ".  .  .  .  O  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      "GEN 8 \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  O  .  .  .  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  \n",
      ".  .  O  O  .  O  O  .  .  .  \n",
      ".  O  O  O  .  O  O  O  .  .  \n",
      ".  .  O  O  .  O  O  .  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  \n",
      ".  .  .  .  O  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      "GEN 9 \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  \n",
      ".  .  O  .  .  .  O  .  .  .  \n",
      ".  O  .  .  .  .  .  O  .  .  \n",
      ".  O  .  .  .  .  .  O  .  .  \n",
      ".  O  .  .  .  .  .  O  .  .  \n",
      ".  .  O  .  .  .  O  .  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n",
      "GEN 10 \n",
      ".  .  .  .  O  .  .  .  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  \n",
      ".  .  O  O  O  O  O  .  .  .  \n",
      ".  O  O  .  .  .  O  O  .  .  \n",
      "O  O  O  .  .  .  O  O  O  .  \n",
      ".  O  O  .  .  .  O  O  .  .  \n",
      ".  .  O  O  O  O  O  .  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  \n",
      ".  .  .  .  O  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  \n"
     ]
    }
   ],
   "source": [
    "run('''\n",
    "100 REM CONWAY'S GAME OF LIFE\n",
    "\n",
    "102 REM G IS NUMBER OF GENERATIONS, \n",
    "104 REM M IS MATRIX SIZE (M X M)\n",
    "106 REM L(SELF, NEIGHBORS_ALIVE) IS 1 IFF CELL WITH THOSE COUNTS LIVES\n",
    "108 REM A(X, Y) IS 1 IFF CELL AT (X, Y) IS LIVE\n",
    "110 REM B(X, Y) GETS THE NEXT GENERATION\n",
    "\n",
    "120 READ G,      M,      L(0,3), L(1,3), L(1,2)\n",
    "121 DATA 10,     10,     1,      1,      1\n",
    "130 READ A(3,4), A(3,5), A(3,6), A(6,5), A(6,6), A(7,5), A(7,6)\n",
    "131 DATA 1,      1,      1,      1,      1,      1,      1\n",
    "\n",
    "150 REM MAIN LOOP: PRINT, THEN REPEAT G TIMES: UPDATE / COPY / PRINT\n",
    "155 LET I = 0\n",
    "160 GOSUB 700\n",
    "170 FOR I = 1 TO G\n",
    "180   GOSUB 300\n",
    "190   GOSUB 500\n",
    "200   GOSUB 700\n",
    "210 NEXT I\n",
    "220 STOP\n",
    "\n",
    "300 REM ========== UPDATE B = NEXT_GEN(A)\n",
    "310 FOR Y = 1 TO M\n",
    "320   FOR X = 1 TO M\n",
    "325     LET N = A(X-1,Y)+A(X+1,Y)+A(X,Y-1)+A(X,Y+1)+A(X-1,Y-1)+A(X+1,Y+1)+A(X-1,Y+1)+A(X+1,Y-1)\n",
    "330     LET B(X, Y) = L(A(X, Y), N)\n",
    "340   NEXT X\n",
    "350 NEXT Y\n",
    "360 RETURN\n",
    "\n",
    "500 REM ========== COPY A = B\n",
    "510 FOR Y = 1 TO M\n",
    "520   FOR X = 1 TO M\n",
    "530     LET A(X, Y) = B(X, Y)\n",
    "540   NEXT X\n",
    "550 NEXT Y\n",
    "560 RETURN\n",
    "\n",
    "700 REM ========== PRINT A\n",
    "705 PRINT \"GEN \" I\n",
    "710 FOR Y = 1 TO M\n",
    "720   FOR X = 1 TO M\n",
    "730     IF A(X, Y) = 1 THEN 760\n",
    "740       PRINT \".\";\n",
    "750       GOTO 770\n",
    "760     PRINT \"O\";\n",
    "770   NEXT X\n",
    "780   PRINT\n",
    "790 NEXT Y\n",
    "795 RETURN\n",
    "\n",
    "999 END \n",
    "''')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Actually, this was an assignment in my high school BASIC class. (We used a [slightly different](https://www.grc.com/pdp-8/docs/os8_basic_reference.pdf) version of BASIC.) Back then, output was on rolls of paper, and I thought it was wasteful to print only one generation per line. So I arranged to print multiple generations on the same line, storing them until it was time to print them out. But BASIC doesn't have three-dimensional arrays, so I needed to store several generations worth of data in one `A(X, Y)` value. Today, I know that that could be done by allocating one bit for each generation, but back then I don't think I knew about binary representation, so I stored one generation in each decimal digit. That means I no longer need two matrixes, `A` and `B`; instead, the current generation will always be the value in the one's place, the previous generation in the ten's place, and the one before that in the hundred's place. (Also, I admit I cheated: I added the mod operatoir, `%`, which did not appear in early versions of BASIC, just because it was useful for this program.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "GEN 0                            GEN 1                            GEN 2 \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  O  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  O  .  .  .  .  .  .  .  \n",
      ".  .  O  .  .  O  O  .  .  .  |  .  O  O  O  .  O  O  .  .  .  |  .  .  O  .  O  O  O  .  .  .  \n",
      ".  .  O  .  .  O  O  .  .  .  |  .  .  .  .  .  O  O  .  .  .  |  .  .  O  .  O  O  O  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      "GEN 3                            GEN 4                            GEN 5 \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  O  O  .  .  .  .  .  \n",
      ".  .  .  O  .  O  .  .  .  .  |  .  .  O  O  O  O  .  .  .  .  |  .  .  O  .  O  O  .  .  .  .  \n",
      ".  O  O  .  O  .  O  .  .  .  |  .  .  O  .  O  .  O  .  .  .  |  .  .  O  .  .  .  O  .  .  .  \n",
      ".  .  .  .  O  .  O  .  .  .  |  .  .  .  O  O  .  O  .  .  .  |  .  .  .  O  O  .  O  .  .  .  \n",
      ".  .  .  .  .  O  .  .  .  .  |  .  .  .  .  .  O  .  .  .  .  |  .  .  .  .  O  O  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      "GEN 6                            GEN 7                            GEN 8 \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  O  .  .  .  .  .  |  .  .  .  .  O  .  .  .  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  |  .  .  .  O  .  O  .  .  .  .  |  .  .  .  O  O  O  .  .  .  .  \n",
      ".  .  O  .  O  O  .  .  .  .  |  .  .  O  .  .  .  O  .  .  .  |  .  .  O  O  .  O  O  .  .  .  \n",
      ".  .  O  .  .  .  O  .  .  .  |  .  .  O  .  .  .  O  .  .  .  |  .  O  O  O  .  O  O  O  .  .  \n",
      ".  .  .  O  O  .  O  .  .  .  |  .  .  O  .  .  .  O  .  .  .  |  .  .  O  O  .  O  O  .  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  |  .  .  .  O  .  O  .  .  .  .  |  .  .  .  O  O  O  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  O  .  .  .  .  .  |  .  .  .  .  O  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n",
      "GEN 9                            GEN 10                           GEN 11 \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  O  .  .  .  .  .  |  .  .  .  O  O  O  .  .  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  |  .  .  .  O  O  O  .  .  .  .  |  .  .  O  .  .  .  O  .  .  .  \n",
      ".  .  O  .  .  .  O  .  .  .  |  .  .  O  O  O  O  O  .  .  .  |  .  O  .  .  .  .  .  O  .  .  \n",
      ".  O  .  .  .  .  .  O  .  .  |  .  O  O  .  .  .  O  O  .  .  |  O  .  .  .  O  .  .  .  O  .  \n",
      ".  O  .  .  .  .  .  O  .  .  |  O  O  O  .  .  .  O  O  O  .  |  O  .  .  O  .  O  .  .  O  .  \n",
      ".  O  .  .  .  .  .  O  .  .  |  .  O  O  .  .  .  O  O  .  .  |  O  .  .  .  O  .  .  .  O  .  \n",
      ".  .  O  .  .  .  O  .  .  .  |  .  .  O  O  O  O  O  .  .  .  |  .  O  .  .  .  .  .  O  .  .  \n",
      ".  .  .  O  O  O  .  .  .  .  |  .  .  .  O  O  O  .  .  .  .  |  .  .  O  .  .  .  O  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  O  .  .  .  .  .  |  .  .  .  O  O  O  .  .  .  .  \n",
      ".  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  |  .  .  .  .  .  .  .  .  .  .  \n"
     ]
    }
   ],
   "source": [
    "run('''\n",
    "100 REM CONWAY'S GAME OF LIFE\n",
    "\n",
    "102 REM G IS NUMBER OF GENERATIONS, \n",
    "104 REM M IS MATRIX SIZE (M X M)\n",
    "106 REM L(SELF, NEIGHBORS_ALIVE) IS 1 IFF CELL WITH THOSE COUNTS LIVES\n",
    "108 REM A(X, Y) STORES THE HISTORY OF THE CELL AT (X, Y); 1 MEANS LIVE,\n",
    "110 REM BUT WE STORE SEVERAL GENERATIONS: A(X, Y) = 100 MEANS THE CELL\n",
    "112 REM IS DEAD IN THE CURRENT AND PREVIOUS GENERATION (00), BUT LIVE IN THE\n",
    "114 REM GENERATION BEFORE THAT (1). WE STORE MULTIPLE GENERATIONS SO THAT\n",
    "116 REM WE CAN PRINT THEM OUT ON ONE LINE, SAVING SPACE/PAPER.\n",
    "\n",
    "120 READ G,      M,      L(0,3), L(1,3), L(1,2)\n",
    "122 DATA 11,     10,     1,      1,      1\n",
    "124 READ A(3,4), A(3,5), A(3,6), A(6,5), A(6,6), A(7,5), A(7,6)\n",
    "126 DATA 1,      1,      1,      1,      1,      1,      1\n",
    "\n",
    "130 REM FNA(N) = THE PREVIOUS GENERATION'S VALUE\n",
    "132 DEF FNA(N) = INT(N  / 10) % 10\n",
    "\n",
    "134 REM FNC(N) = THE GENERATION IN COLUMN C; FNC(123) = C FOR EACH C IN 1..3\n",
    "136 DEF FNC(N) = FNA(N / (10 ^ (2 - C)))\n",
    "\n",
    "150 REM MAIN LOOP: DO 3 UPDATES (2 FIRST TIME), THEN PRINT AND SHIFT\n",
    "160 FOR I = 1 TO G\n",
    "170   GOSUB 300\n",
    "175   IF I % 3 <> 2 THEN 200\n",
    "180     GOSUB 700\n",
    "190     GOSUB 800\n",
    "200 NEXT I\n",
    "210 STOP\n",
    "\n",
    "300 REM ========== UPDATE A: SHIFT OLD GENS LEFT; ADD IN NEW GEN\n",
    "310 FOR Y = 1 TO M\n",
    "320   FOR X = 1 TO M \n",
    "330     LET A(X, Y) = 10 * A(X, Y)\n",
    "340   NEXT X\n",
    "350 NEXT Y\n",
    "360 FOR Y = 1 TO M\n",
    "370   FOR X = 1 TO M \n",
    "380     LET N1 = FNA(A(X+1,Y-1)) + FNA(A(X+1,Y)) + FNA(A(X+1,Y+1)) + FNA(A(X,Y-1))\n",
    "390     LET N2 = FNA(A(X-1,Y-1)) + FNA(A(X-1,Y)) + FNA(A(X-1,Y+1)) + FNA(A(X,Y+1))\n",
    "400     LET S  = FNA(A(X, Y))\n",
    "410     LET A(X, Y) = A(X, Y) + L(S, N1 + N2)\n",
    "420   NEXT X\n",
    "430 NEXT Y\n",
    "440 RETURN\n",
    "\n",
    "700 REM ========== PRINT A (3 GENERATIONS ACROSS THE PAGE)\n",
    "705 PRINT \"GEN \" I-2, \" \", \"   GEN \" I-1, \" \", \"      GEN \" I\n",
    "710 FOR Y = 1 TO M\n",
    "715   FOR C = 1 TO 3\n",
    "720     FOR X = 1 TO M\n",
    "730       IF FNC(A(X, Y)) = 1 THEN 760\n",
    "740         PRINT \".\";\n",
    "750         GOTO 770\n",
    "760       PRINT \"O\";\n",
    "770     NEXT X\n",
    "772     IF C = 3 THEN 777\n",
    "775       PRINT \"|\";\n",
    "777   NEXT C\n",
    "780   PRINT\n",
    "790 NEXT Y\n",
    "795 RETURN\n",
    "\n",
    "800 REM ========== FORGET ALL BUT THE MOST RECENT GENERATION IN A\n",
    "810 FOR Y = 1 TO M\n",
    "820   FOR X = 1 TO M\n",
    "830     LET A(X, Y) = A(X, Y) % 10\n",
    "840   NEXT X\n",
    "850 NEXT Y\n",
    "860 RETURN\n",
    "\n",
    "999 END \n",
    "''')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}