"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"After the function: 0\n",
"Final add: 9\n"
]
}
],
"source": [
"total=0\n",
"def summing(arg1, arg2):\n",
" total = arg1+arg2\n",
" return total\n",
"\n",
"add = summing(1,2)\n",
"print(\"After the function:\", total)\n",
"\n",
"total=2\n",
"def sum2(arg1, arg2):\n",
" tt = arg1+arg2+total\n",
" return tt\n",
"add = sum2(3,4)\n",
"print(\"Final add:\",add)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Pass by Reference vs Value\n",
"Python passes arguments per reference. That means they still point to the same location in memory outside and within a function.\n",
"\n",
"Do you understand what is going on in the following example?\n",
"\n",
"
\n",
"
\n",
" \n",
"
\n",
"
\n",
"
\n",
"
In changeme(), since li is a global object, it's values get modified despite the function not returning anything.
\n",
"
In changeme2(), li becomes a local variable. The function completely forgets about the argument.
\n",
"
Note the behaviour would be the same without passing li as an argument since li is a global object.
\n",
"
What would happen if li was a tuple instead of a list?
\n",
"
\n",
"
\n",
"
\n",
"
"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"List at the start: [1, 2]\n",
"List after changeme: [3, 2]\n",
"List after changeme2: [3, 2]\n"
]
}
],
"source": [
"def changeme(li):\n",
" li[0] = 3\n",
" return\n",
"\n",
"def changeme2(li):\n",
" li = [5,4]\n",
" return\n",
"\n",
"li=[1,2]\n",
"print(f\"List at the start: {li}\")\n",
"\n",
"changeme(li)\n",
"print(f\"List after changeme: {li}\")\n",
"\n",
"changeme2(li)\n",
"print(f\"List after changeme2: {li}\")"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[3, 2] [3, 2]\n"
]
}
],
"source": [
"def local_copy(li):\n",
" loc=li\n",
" loc[0]=3\n",
" return loc\n",
"\n",
"li=[1,2]\n",
"print(local_copy(li), li)"
]
},
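{
"cell_type": "markdown",
"metadata": {},
"source": [
"In `local_copy()` above, `loc=li` does not copy the list: both names refer to the same object, which is why `li` changes as well. Below is a minimal sketch of making a real copy instead, and of what happens with a tuple. The names `real_copy`, `copied` and `tup` are illustrative only and not part of the original example.\n",
"\n",
"```python\n",
"def real_copy(li):\n",
"    loc = list(li)   # builds a new list holding the same elements\n",
"    loc[0] = 3       # only the copy is modified\n",
"    return loc\n",
"\n",
"li = [1, 2]\n",
"copied = real_copy(li)\n",
"print(copied, li)    # the caller's list is left untouched\n",
"\n",
"# A tuple is immutable, so item assignment raises a TypeError\n",
"tup = (1, 2)\n",
"try:\n",
"    tup[0] = 3\n",
"    tuple_error = None\n",
"except TypeError as err:\n",
"    tuple_error = err\n",
"print('Tuple assignment failed:', tuple_error is not None)\n",
"```\n",
"\n",
"Note that `list(li)` makes a shallow copy: any nested lists inside `li` would still be shared with the original."
]
},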
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Read in files, Write to files"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Open and close files"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To open a text file, there is the `open()` function. It accepts 2 arguments: name of the file and the opening mode for the file.\n",
"\n",
"| Modes | Meaning |\n",
"|:------:|--------|\n",
"| r | read-only mode |\n",
"| w | write-only mode |\n",
"| a | append to existing file |\n",
"| r+ | read and write mode |\n",
"\n",
"For binary files, you need to append `b` to the mode so Python knows to read or write byte objects.\n",
"To read in data, there are 3 methods: `read()`, `readline()`, `readlines()`. The only difference is the amount of data they read from the file. `read()` will only read the given number of charaters (or whole file), `readline()` reads the file line by line, `readlines()` reads in the entire file or a maximum number of bytes/characters.\n",
"Also, Python handily manages the conversion of end of line markers (`\\n` on Unix, `\\r\\n` on Windows) so you don't have to worry about it.\n",
"\n",
"To close a file, use the `close()` method."
]
},
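{
"cell_type": "markdown",
"metadata": {},
"source": [
"The modes in the table above can be tried out with a short sketch like the following. It writes to a temporary directory so that no existing file is touched; the file name `modes_demo.txt` is purely illustrative.\n",
"\n",
"```python\n",
"import os\n",
"import tempfile\n",
"\n",
"# Work in a temporary directory so no existing file is overwritten\n",
"path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')\n",
"\n",
"f = open(path, 'w')    # 'w' creates the file, or empties it if it exists\n",
"f.write('first\\n')\n",
"f.close()\n",
"\n",
"f = open(path, 'a')    # 'a' adds to the end of the file\n",
"f.write('second\\n')\n",
"f.close()\n",
"\n",
"f = open(path, 'r')    # 'r' reads text and returns str\n",
"text = f.read()\n",
"f.close()\n",
"\n",
"f = open(path, 'rb')   # adding 'b' returns bytes instead of str\n",
"raw = f.read()\n",
"f.close()\n",
"\n",
"print(repr(text))\n",
"print(type(raw))\n",
"```\n",
"\n",
"Opening the same file again with `w` instead of `a` would have discarded the first line."
]
},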
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Read from file"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# f.seek(0) allows to rewind the file to the start of the file after each read.\n",
"# Check what each output looks like. What is the difference between `f.read()` and `f.readlines()`?\n",
"f = open('test.txt','r')\n",
"whole_file = f.read()\n",
"f.seek(0)\n",
"first_line = f.readline()\n",
"f.seek(0)\n",
"whole2 = f.readlines()\n",
"f.close()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"This is an example text file.\n",
"Let's see what happens with csv-type files\n",
"50, 30, 40\n",
"70, 20, 30\n",
" This is an example text file.\n",
" ['This is an example text file.\\n', \"Let's see what happens with csv-type files\\n\", '50, 30, 40\\n', '70, 20, 30\\n']\n"
]
}
],
"source": [
"print(whole_file, first_line, whole2)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"This is an example text file.\\nLet's see what happens with csv-type files\\n50, 30, 40\\n70, 20, 30\\n\""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"whole_file"
]
},
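{
"cell_type": "markdown",
"metadata": {},
"source": [
"The three read methods all continue from the current position in the file, which is easiest to see when they are called one after the other without `seek()`. The sketch below assumes a small three-line file created on the spot in a temporary directory (the name `read_demo.txt` is illustrative). It also shows that a file object can be looped over line by line.\n",
"\n",
"```python\n",
"import os\n",
"import tempfile\n",
"\n",
"# Create a small three-line file to read back\n",
"path = os.path.join(tempfile.mkdtemp(), 'read_demo.txt')\n",
"f = open(path, 'w')\n",
"f.write('one\\ntwo\\nthree\\n')\n",
"f.close()\n",
"\n",
"f = open(path, 'r')\n",
"first_chars = f.read(2)       # only the first two characters\n",
"rest_of_line = f.readline()   # from the current position to the end of the line\n",
"remaining = f.readlines()     # list of the lines that are left\n",
"\n",
"f.seek(0)                     # back to the start of the file\n",
"looped = [line.rstrip('\\n') for line in f]   # iterate over the file line by line\n",
"f.close()\n",
"\n",
"print(repr(first_chars), repr(rest_of_line), remaining, looped)\n",
"```\n",
"\n",
"Looping over the file object reads one line at a time, which avoids loading a large file into memory all at once."
]
},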
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Write to file"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Writing to a file is pretty symetrical to reading it in:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"f = open('my_file.txt','w')\n",
"f.write('Hello!')\n",
"lines=['Other line', 'One more']\n",
"f.writelines(lines)\n",
"f.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To check what's in the file, we use the iPython magic commands to call the `cat` bash command:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hello!Other lineOne more"
]
}
],
"source": [
"!cat my_file.txt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Hmm, something went wrong. Python needs you to specify those are separate lines by adding a newline symbol: `\\n`"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"f = open('my_file.txt','w')\n",
"f.write('Hello!\\n')\n",
"lines=['Other line\\n', 'One more\\n']\n",
"f.writelines(lines)\n",
"f.close()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hello!\r\n",
"Other line\r\n",
"One more\r\n"
]
}
],
"source": [
"!cat my_file.txt"
]
},
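{
"cell_type": "markdown",
"metadata": {},
"source": [
"When the strings to write do not already end with a newline, one possible approach is to add it while writing rather than editing the list by hand. This is a sketch only; the file name `write_demo.txt` and the temporary directory are illustrative.\n",
"\n",
"```python\n",
"import os\n",
"import tempfile\n",
"\n",
"path = os.path.join(tempfile.mkdtemp(), 'write_demo.txt')\n",
"lines = ['Hello!', 'Other line', 'One more']\n",
"\n",
"f = open(path, 'w')\n",
"# writelines() writes each string as it is, so append the newline to each one\n",
"f.writelines(line + '\\n' for line in lines)\n",
"f.close()\n",
"\n",
"# Read the file back to check the result\n",
"f = open(path, 'r')\n",
"content = f.read()\n",
"f.close()\n",
"print(content)\n",
"```\n",
"\n",
"`writelines()` accepts any iterable of strings, so the generator expression avoids building a second list."
]
},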
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### With statement"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is also possible to use the `with` statement to work with files. This is commonly used as it provides better error handling and closes the file for you."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open('test.txt','r') as f:\n",
" first_line = f.readline()\n",
"\n",
"print(first_line)\n",
"second_line = f.readline()\n",
"print(second_line)"
]
},
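{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see what the `with` statement does to the file object, the attribute `f.closed` can be checked inside and outside the block. The sketch below uses a temporary file with an illustrative name (`with_demo.txt`) and catches the error raised when reading from the closed file.\n",
"\n",
"```python\n",
"import os\n",
"import tempfile\n",
"\n",
"path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')\n",
"with open(path, 'w') as f:\n",
"    f.write('line 1\\nline 2\\n')\n",
"\n",
"with open(path, 'r') as f:\n",
"    first_line = f.readline()\n",
"    inside = f.closed      # False: the file is still open here\n",
"\n",
"outside = f.closed         # True: leaving the block closed the file\n",
"\n",
"try:\n",
"    f.readline()           # reading from a closed file fails\n",
"    error_message = None\n",
"except ValueError as err:\n",
"    error_message = str(err)\n",
"\n",
"print(inside, outside, error_message)\n",
"```\n",
"\n",
"Any reading or writing therefore has to happen inside the indented block; only the values already read, such as `first_line`, remain available afterwards."
]
},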
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exercise\n",
"Create a list of the numerical tabular values in test.txt. Make sure the values are of a numeric type (hint: check the Python builtin functions [here](https://docs.python.org/3/library/functions.html#built-in-functions)).\n",
"\n",
"Format the list as you wish:\n",
"\n",
"`[50,30,40,70,20,30]`\n",
"\n",
"`[[50,30,40],[70,20,30]]`\n",
"\n",
"`[[50,70],[30,20],[40,30]]`\n",
"\n",
"
\n",
"
\n",
" \n",
"
\n",
"
\n",
"
\n",
"
\n",
"with open('test.txt','r') as f:\n",
" # skip the header\n",
" head_length=2 # number of lines in the reader\n",
" for i in range(head_length):\n",
" f.readline()\n",
" \n",
" # Create a list to store the data\n",
" li = []\n",
" li2 = []\n",
" li3 = []\n",
" # Read each line and parse as needed.\n",
" for line in f.readlines():\n",
" tt = line.split(',')\n",
" tmp = [int(numb) for numb in tt]\n",
" li.extend(tmp)\n",
" li2.append(tmp)\n",
" if li3 == []:\n",
" li3=[[n] for n in tmp]\n",
" else:\n",
" for ind in range(len(tmp)):\n",
" li3[ind].append(tmp[ind])\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"--------\n",
"# Additional packages"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When you start Python very little gets loaded by default. This is to ensure a quick start of the interpreter and a lower memory usage. Obviously, you will need more than the default.\n",
"\n",
"Additionally, Python is open-sourced and as such lots of additional packages have been contributed over the years. These packages need to be installed before being able to use them.\n",
"\n",
"There are several ways to install packages. A simple one for individuals is Anaconda or Miniconda. That is what you used to prepare for this training (remember the instructions sent before the first training?). One advantage is that it handles dependencies on other packages and non-Python libraries for you. One disadvantage is that not all packages are shared via conda. It also creates a lot of files, which is not good for NCI.\n",
"\n",
"For working at NCI, the CMS maintain several Python environments to avoid duplications. These are quite extensive and we are open to installing more packages (as long as they are compatible with the existing environment). Please try those environments before installing your own. They are publicly opened, so not just for the Centre's folk.\n",
"\n",
"```\n",
"module use /g/data/hh5/public/modules\n",
"module load conda\n",
"```\n",
"\n",
"This will load the stable environment for Python 3, which is most likely the one you want to use. A list of the packages under this environment can be found with: `conda list`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load packages for use in your scripts or notebooks\n",
"You can load new packages at any point in your script. It's usually done at the top but it doesn't have to."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy # Most basic form. Imports the whole package\n",
"import numpy as np # Imports the whole package but give an alias to save on typing in your code\n",
"from matplotlib import pyplot as plt # Import just one part of the package.\n",
"import matplotlib.pyplot as plt # Does the same as above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# To use a package:\n",
"a = np.arange(20)\n",
"a"
]
},
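{
"cell_type": "markdown",
"metadata": {},
"source": [
"The import forms shown above all give access to the same underlying objects; they only differ in the name you type afterwards. The sketch below illustrates this with the `math` and `os` modules from the standard library, chosen here only because they need no installation. The alias `osp` is an arbitrary choice.\n",
"\n",
"```python\n",
"import math                   # whole module, used as math.sqrt\n",
"import math as m              # same module under a shorter alias\n",
"from math import sqrt         # a single name taken from the module\n",
"from os import path as osp    # a submodule under an alias\n",
"\n",
"a = math.sqrt(16)\n",
"b = m.sqrt(16)\n",
"c = sqrt(16)\n",
"print(a, b, c)\n",
"print(m is math)              # both names point to the same module object\n",
"print(osp.basename('/some/dir/file.txt'))\n",
"```\n",
"\n",
"Importing a single name with `from ... import ...` saves typing but makes it less obvious where the function comes from, so aliases such as `np` and `plt` are a common compromise."
]
},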
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Some useful packages\n",
"\n",
"From basic Python install:\n",
" -
os: operating system, e.g. environment variables, working directory, change permissions on files and directories.\n",
" -
pathlib: pathname manipulations, e.g. separate or join basename and file name, check file existence.\n",
" -
shutil: file operations, e.g. copy, move, delete files\n",
" -
glob: pathname pattern expansion, e.g. list of files matching: './[0-9].*'\n",
" -
argparse: parser for command-line options.\n",
" -
subprocess: to run a separate program.\n",
" \n",
"Additional packages:\n",
" -
numpy: arrays in Python\n",
" -
scipy: more maths functions (FFT, ODE, linear algebra, interpolation etc.)\n",
" -
pandas: the ultimate to work with time series\n",
" -
xarray: better arrays in Python (labelled arrays)\n",
" -
matplotlib: plotting in Python\n",
" -
cartopy: map projection and plotting in Python\n",
" -
dask: parallelisation "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}