{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Collate outside the notebook\n", "## Python files, input files, output file\n", "\n", "---\n", "- Set up a PyCharm project\n", "- Create a Python file\n", "- Run a script\n", " - In PyCharm\n", " - In the terminal\n", "- Input files\n", "- Output file\n", "- Exercise\n", "\n", "---\n", "\n", "Here is another way to run the scripts you produced in the previous tutorials (note: although they technically mean different things, we will use the words code, script and program interchangeably). This tutorial assumes that you have already gone through the tutorials on Collate plain texts ([1](http://nbviewer.jupyter.org/github/DiXiT-eu/collatex-tutorial/blob/master/unit5/1_collate-plain-text.ipynb) and [2](http://nbviewer.jupyter.org/github/DiXiT-eu/collatex-tutorial/blob/master/unit5/2_collate-plain-text.ipynb)) and on the different [Collation outputs](http://nbviewer.jupyter.org/github/DiXiT-eu/collatex-tutorial/blob/master/unit5/3_collation-outputs.ipynb). Everything that we will do here is also possible in a Jupyter notebook, and certain sections, such as *Input files*, recap material already seen in the previous tutorials.\n", "\n", "In the [Command line tutorial](http://nbviewer.jupyter.org/github/DiXiT-eu/collatex-tutorial/blob/master/unit1/Command_line.ipynb), we briefly saw how to run a Python program. In the terminal, type\n", "\n", " python myfile.py\n", "\n", "replacing “myfile.py” with the name of your Python program.\n", "\n", "### Again on file system hygiene: directory 'Scripts'\n", "In this tutorial, we will create Python programs. Where should you save the files that you create? Remember that [we created a directory for this workshop](http://nbviewer.jupyter.org/github/DiXiT-eu/collatex-tutorial/blob/master/unit1/Command_line.ipynb#Create-a-directory-for-this-workshop), called 'Workshop'. Now let's create a sub-directory, called 'Scripts', to store all our Python programs. 
\n", "\n", "---\n", "\n", "## Set up a PyCharm project\n", "\n", "If you are using PyCharm for these exercises, it is worth setting up a project that will automatically save the files you create to the 'Scripts' directory you just created (see above). To do this, open PyCharm and from the *File* menu select *New Project*. In the dialogue box that appears, navigate to the 'Scripts' directory you made for this workshop by clicking the button with '...' on it, to the right of the *location* box. Then click *create*. This will create a new project that saves all of its files to the folder you have selected.\n", "\n", "## Create a Python file\n", "\n", "Let's do this step by step. First of all, create a Python file.\n", "\n", "- Open PyCharm, if you downloaded it before, or another text editor: Notepad++ for Windows or TextWrangler for Mac OS X.\n", "- Create a new file and copy and paste the code we used before:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+---+-----+-------+-------+---------------------+------+------+\n", "| A | The | quick | brown | fox jumped over the | lazy | dog. |\n", "| B | The | - | brown | fox jumped over the | - | dog. |\n", "| C | The | bad | - | fox jumped over the | lazy | dog. |\n", "+---+-----+-------+-------+---------------------+------+------+\n" ] } ], "source": [ "from collatex import *\n", "collation = Collation()\n", "collation.add_plain_witness( \"A\", \"The quick brown fox jumped over the lazy dog.\")\n", "collation.add_plain_witness( \"B\", \"The brown fox jumped over the dog.\" )\n", "collation.add_plain_witness( \"C\", \"The bad fox jumped over the lazy dog.\")\n", "table = collate(collation)\n", "print(table)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Now save the file, as 'collate.py', inside the directory 'Scripts' (see above). 
If you set up a project in PyCharm, the files should automatically be saved in the correct place.\n", " \n", "## Run the script\n", "\n", "### In PyCharm\n", "\n", "- In PyCharm, you can run the script using the run button or from the menu.\n", "- The result will appear in a window at the bottom of the page.\n", "\n", "### In the terminal\n", "\n", "\n", "\n", "- Open the terminal and navigate to the folder where your script is, using the 'cd' command (again, refer to the [Command line tutorial](http://nbviewer.jupyter.org/github/DiXiT-eu/collatex-tutorial/blob/master/unit1/Command_line.ipynb) if you don't know what this means). Then type\n", "\n", " python collate.py\n", " \n", " If you are not in the directory where your script is, you need to specify the path to that file. If you are in the Home directory, for example, the command would look like\n", " \n", " python Workshop/Scripts/collate.py\n", "\n", "- The result will appear below in the terminal.\n", "\n", "\n", "## Input files\n", "\n", "In the [first tutorial](http://nbviewer.jupyter.org/github/DiXiT-eu/collatex-tutorial/blob/master/unit5/1_collate-plain-text.ipynb), we saw how to use texts stored in files as witnesses for the collation. We used the `open` command to open each text file and assign its contents to a variable with an appropriately chosen name; and don't forget the `encoding=\"utf-8\"` bit!\n", "\n", "Let's try to do the same in our script 'collate.py', using the data in *fixtures/Darwin/txt* (only the first paragraph: \\_par1) and producing an output in XML/TEI. The code will look like this:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "