{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Notebook 1\n", "\n", "## Functions\n", "\n", "- Write a `function` that squares the input\n", "- Write a `function` that takes two numbers as input and returns the product of the two\n", "- Study the code below. Why does `product` not change?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "product = 0\n", "print('Product before: '+ str(product))\n", "\n", "def do_calculation(a,b):\n", "    product = a * b\n", "    return product\n", "\n", "\n", "r = do_calculation(3,2)\n", "print('Result: '+ str(r))\n", "print('Product after: '+ str(product))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Study the code below. Where does it get `num2` from? What are the risks with this?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def division(num1):\n", "    result = num1/num2\n", "    return result\n", "\n", "num2 = 2\n", "division(8)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Write a `function` that takes a list as input and returns a list with only the even values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Rewrite the code below to use functions, as shown in the template that follows it (note: this is the code from Day_2_Exercise_1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fh = open('../../downloads/genotypes_small.vcf', 'r', encoding = 'utf-8')\n", "\n", "wt = 0\n", "het = 0\n", "hom = 0\n", "\n", "for line in fh:\n", "    if not line.startswith('#'):\n", "        cols = line.strip().split('\\t')\n", "        chrom = cols[0]\n", "        pos = cols[1]\n", "        if chrom == '2' and pos == '136608646':\n", "            for geno in cols[9:]:\n", "                alleles = geno[0:3]\n", "                if alleles == '0/0':\n", "                    wt += 1\n", "                elif alleles == '0/1':\n", "                    het += 1\n", "                elif alleles == '1/1':\n", "                    hom += 1\n", "\n", "freq = (2*hom + het)/((wt+hom+het)*2)\n", "print('The 
frequency of the rs4988235 SNP is: '+str(freq)) \n", "\n", "fh.close()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def updateCounts(#inputs):\n", "    # put code here\n", "    return wt, het, hom\n", "\n", "\n", "def calculateFreq(#inputs):\n", "    # put code here\n", "    return freq\n", "\n", "\n", "def formatNicely(#inputs):\n", "    # put code here\n", "    return formatted\n", "\n", "\n", "\n", "fh = open('../../downloads/genotypes_small.vcf', 'r', encoding = 'utf-8')\n", "\n", "wt = 0\n", "het = 0\n", "hom = 0\n", "\n", "for line in fh:\n", "    if not line.startswith('#'):\n", "        cols = line.strip().split('\\t')\n", "        chrom = cols[0]\n", "        pos = cols[1]\n", "        if chrom == '2' and pos == '136608646':\n", "            for geno in cols[9:]:\n", "                wt, het, hom = updateCounts(geno[0:3], wt, het, hom)\n", "fh.close()\n", "\n", "freq = calculateFreq(wt, het, hom)\n", "print('The frequency of the rs4988235 SNP is: '+formatNicely(freq)) # print result with 2 decimals" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Put all the functions you wrote above into a separate file, and re-do the above assignment by importing the necessary functions from the file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Answers - Functions\n", "\n", "- Write a function that squares the input:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "25" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# first define the function\n", "def square(number): # this function takes one argument as input\n", "    squared = number * number # do the calculation\n", "    return squared # return the result\n", "\n", "square(5) # call the function with the input to test" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What you name your input arguments (in this example `number`) does not matter; it is just like naming any other 
variable. Just make sure to give it a reasonable name (writing `def square(elephants)` works, but you will probably just confuse yourself). \n", "\n", "- Write a function that takes two numbers as input and returns the product of the two:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "18" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def product(num1, num2): # this function takes exactly two arguments as input\n", "    result = num1 * num2\n", "    return result\n", "\n", "product(3,6) # call the function with 2 arguments as input" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Study the code below. Why does `product` not change? \n", "- Study the code below. Where does it get `num2` from? What are the risks with this?" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Product before: 0\n", "Result: 6\n", "Product after: 0\n" ] } ], "source": [ "product = 0\n", "print('Product before: '+ str(product))\n", "\n", "def do_calculation(a,b):\n", "    product = a * b\n", "    return product\n", "\n", "\n", "r = do_calculation(3,2)\n", "print('Result: '+ str(r))\n", "print('Product after: '+ str(product))" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4.0\n" ] } ], "source": [ "def division(num1):\n", "    result = num1/num2\n", "    return result\n", "\n", "num2 = 2\n", "res = division(8)\n", "print(res)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the first example above we first assign 0 to `product`, but inside the function we assign `a*b` to `product`, so why hasn't `product` changed when we print it at the end? Because any variable assigned within a function is a local variable that does not exist outside the function. 
On the other hand, any variable assigned outside a function is global: functions can read it from the inside, but code outside a function cannot see the function's local variables. This can be seen in the second example above, where the function first looks for a local variable called `num2` and, when it doesn't find one, looks outside for a global variable. The risk with this behaviour arises when a variable name that is already used outside is re-used inside the function. If you forget to assign it inside the function, the code will not crash with an error message; it will silently use the global value and may give you a wrong result." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Write a function that returns a list with only even values:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2, 8, 16]\n" ] } ], "source": [ "def evenList(lst): # input list\n", "    newList = [] # create an empty list\n", "    for item in lst: # loop over list\n", "        if item%2 == 0: # check if item in list is divisible by 2\n", "            newList.append(item) # append to new list\n", "    return newList\n", "\n", "myList = [1,2,5,8,9,13,16]\n", "myEvenList = evenList(myList) # save the new list\n", "print(myEvenList)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Rewrite the code below to use functions, as shown in the template (note: this is the code from Day_2_Exercise_1):" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The frequency of the rs4988235 SNP is: 0.78\n" ] } ], "source": [ "def updateCounts(alleles, wt, het, hom):\n", "    if alleles == '0/0':\n", "        wt += 1\n", "    elif alleles == '0/1':\n", "        het += 1\n", "    elif alleles == '1/1':\n", "        hom += 1\n", "    return wt, het, hom\n", "\n", "\n", "def calculateFreq(wt, het, hom):\n", "    freq = (2*hom + het)/((wt+hom+het)*2)\n", "    return freq\n", "\n", "\n", "def 
formatNicely(freq):\n", "    formatted = str(round(freq,2))\n", "    return formatted\n", "\n", "\n", "\n", "fh = open('../../downloads/genotypes_small.vcf', 'r', encoding = 'utf-8')\n", "\n", "wt = 0\n", "het = 0\n", "hom = 0\n", "\n", "for line in fh:\n", "    if not line.startswith('#'):\n", "        cols = line.strip().split('\\t')\n", "        chrom = cols[0]\n", "        pos = cols[1]\n", "        if chrom == '2' and pos == '136608646':\n", "            for geno in cols[9:]:\n", "                wt, het, hom = updateCounts(geno[0:3], wt, het, hom)\n", "fh.close()\n", "\n", "freq = calculateFreq(wt, het, hom)\n", "print('The frequency of the rs4988235 SNP is: '+formatNicely(freq))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Put all the functions you wrote above into a separate file, and re-do the above assignments by importing the necessary functions from the file\n", "\n", "If you put the functions above into a file called `myFunctions.py` in the same folder as the notebook, you can import them with: \n", "\n", "`from myFunctions import updateCounts, calculateFreq, formatNicely` \n", "\n", "If you have many functions to import, you can import everything at once with `from myFunctions import *`. Note that this brings every public name from the file into the notebook, so listing the functions explicitly is usually clearer." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 2 }