{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Python Basics\n", "\n", "In this note, we will learn how to use Python as a programming language. It is a condensed version of the official [Python tutorial](https://docs.python.org/3/tutorial/index.html). I have omitted some Python concepts, not because they are unimportant, but because the same tasks can be done more simply with NumPy, SciPy, or AstroPy and their related packages/classes/modules. Before explaining what those are, I think it is better to introduce the core concepts of Python objects very briefly and show how to use them. Once you are familiar with them, you can `import numpy` and use it freely.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using Packages\n", "\n", "\n", "Let me explain some of the useful Python packages. This part explains the basic notation for using packages; it is not a Python tutorial. You can always come back to this note when you are curious about a package used in any of the other notes." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Importing\n", "\n", "To import a package, write\n", "\n", "```python\n", " import [package_name]\n", "```\n", "\n", "For example, for a package named `pkg`:\n", "\n", "```python\n", " import pkg\n", "```\n", "\n", "Each package may have subpackages. Say `pkg` has a subpackage `sub`, and you want to load only `sub` because none of the other subpackages of `pkg` will be used. 
Then you can do it as\n", "\n", "```python\n", " from pkg import sub\n", "```\n", "\n", "And if there is a *sub-sub* package named `ssub`, i.e., `ssub` $\\in$ `sub` $\\in$ `pkg`, then you can do\n", "\n", "```python\n", " from pkg.sub import ssub\n", "```\n", "\n", "That is, Python uses the `(larger).(smaller).(even_smaller)` style to *locate* a certain package.\n", "\n", "To use a function defined in the package (module), e.g., a function named `sum` that sums all its inputs, call it with the same \"locating\" notation:\n", "\n", "```python\n", " ssub.sum(1, 2, 3)\n", "```\n", " \n", "If you did `import pkg` only, you have to specify the location of `ssub` too (this works only if `pkg` loads its subpackages when imported; otherwise write `import pkg.sub.ssub`):\n", "\n", "```python\n", " pkg.sub.ssub.sum(1, 2, 3)\n", "```\n", "\n", "Sometimes you may find a package name inconvenient, since even `ssub` feels too long to type. Then use **`as`**:\n", "\n", "```python\n", " from pkg.sub import ssub as S\n", " S.sum(1, 2, 3)\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### \"Standard\" Packages\n", "\n", "First is the famous **NumPy**. [Official website](http://www.numpy.org/). NumPy is a package developed for fast numerical calculation in Python, and it is used in almost all Python code that contains any numerical calculation, especially with large data. It is customary to import it as **np**.\n", "\n", "```python\n", " import numpy as np\n", "```\n", "**SciPy** is another widely used one. [Official website](http://www.scipy.org/). It mainly contains modules and functions for scientific calculations, e.g., image processing, curve fitting and optimization, integration and differentiation, FFT, special functions, etc. There are too many different possible usages, so I cannot single out \"some widely used\" packages. 
When you want to do a certain job in Python and google it, you may well find a SciPy module/function that does the job very efficiently.\n", "\n", "**matplotlib** is also a very widely used plotting package. [Official website](http://matplotlib.org/). Its style resembles that of MATLAB. As time goes on, you will need to make better plots countless times, and you will encounter matplotlib very frequently. I will not explain much about it here. Just keep in mind that\n", "\n", "```python\n", " import matplotlib.pyplot as plt\n", "```\n", "is a far more widely used convention than just \"`import matplotlib`\" or \"`import matplotlib.pyplot`\". Every time you see `plt.blahblah`, it usually means `pyplot` of `matplotlib`.\n", "\n", "\n", "**astropy** contains an enormous number of tools that you may use as an astronomer (or even as a physicist or a scientist in general), and I will not explain it in detail here. [Official website](http://docs.astropy.org/en/stable/getting_started.html). Some widely used imports are\n", "\n", "```python\n", " from astropy import constants as c\n", " from astropy import units as u\n", " from astropy.io import fits\n", " from astropy.stats import sigma_clip\n", " ...\n", "```\n", "\n", "**photutils**, which is an affiliated package of astropy (as is Ginga), is the one we will learn in this course. [Official website](http://photutils.readthedocs.io/)." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Simple Grammar\n", "\n", "In the following examples, I will assume you know the very basics of programming languages, e.g., at least what `==`, `!=`, or `a[3]` mean. If you are not familiar with any of these and cannot follow the descriptions below, please refer to a good textbook on Python or another programming language.\n", "\n", "### Data Types\n", "\n", "Like many other languages, Python has **Data Types**. 
The ones you will use most of the time are `int` (integer), `float` (real number), and `str` (string, i.e., text). The type can be checked with the function `type(~~~)`. There are many other types, but let me show you these first; you will pick up the others easily by yourself.\n", "\n", "By the way, unlike many traditional languages, you don't have to specify the data type when you declare a variable in Python." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3\n", "3.14\n", "cc\n", "<class 'int'>\n", "<class 'float'>\n", "<class 'str'>\n" ] } ], "source": [ "a = 3 # integer\n", "b = 3.14 # float (real number)\n", "c = 'cc' # string\n", "\n", "print(a) # will print 3\n", "print(b) # will print 3.14\n", "print(c) # will print cc\n", "print(type(a)) # the data types of a, b, c\n", "print(type(b))\n", "print(type(c))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One thing that may surprise those who are familiar with other languages, or even Python 2, is that dividing two integers with `/` now gives a real number (use `//` if you want the integer quotient):" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.6\n", "0.6\n", "True\n" ] } ], "source": [ "a = 3/5 # In C/Python 2/ ..., this would be 0\n", "b = 3./5. # and this is 0.6\n", "print(a)\n", "print(b)\n", "print(a==b) # \"Is a equal to b?\" ==> answer will be either True OR False." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Of course `True` and `False` have a type too, called `bool` (boolean):" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'bool'>\n", "<class 'bool'>\n" ] } ], "source": [ "print(type(True))\n", "print(type(False))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The indexing starts from 0 in Python. 
Let me introduce three new data types to show this clearly." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A: 0 1 2\n", "B: 0 1 2\n", "C: 0 1 2\n", "ABC: [0, 1, 2] (0, 1, 2) [0 1 2]\n", "type: <class 'list'> <class 'tuple'> <class 'numpy.ndarray'>\n" ] } ], "source": [ "import numpy as np\n", "\n", "A = [0,1,2] # classical Python \"list\"\n", "B = (0,1,2) # classical Python \"tuple\"\n", "C = np.array((0,1,2)) # numpy n-dimensional array\n", "print('A: ', A[0], A[1], A[2])\n", "print('B: ', B[0], B[1], B[2])\n", "print('C: ', C[0], C[1], C[2])\n", "print('ABC: ', A, B, C)\n", "print('type: ', type(A), type(B), type(C))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may ask why I used double parentheses for `C`. This is because `np.array` takes not only the elements (in this case 0, 1, and 2, packed into a single tuple), but also other optional arguments. The following example should show why numpy does not accept `np.array(0,1,2)`." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0 1 'test']\n", "0 1 test\n", "<class 'int'> <class 'int'> <class 'str'>\n" ] } ], "source": [ "C_obj = np.array([0,1,'test'], dtype='object')\n", "print(C_obj)\n", "print(C_obj[0], C_obj[1], C_obj[2])\n", "print(type(C_obj[0]), type(C_obj[1]), type(C_obj[2]))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the above example, the option `dtype='object'` made it possible to store not only elements of the same kind (e.g., three integers or three strings), but also mixed data types in one array. So now it's clear: if you use `np.array(0,1,2)`, NumPy takes `0` as the (0-dimensional) data, and then tries to interpret `1` and `2`, which follow the commas, as the next arguments (the second positional argument is `dtype`), for which they are not valid values. 
Thus, an error occurs (verify it yourself).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Conditions/Loops\n", "\n", "There are three main condition/loop statements you may use frequently: `if`, `for`, and `while`. You can use `break` and `continue` too. I do not use `while` very much, but that is just my personal preference as a non-expert." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Different!\n", "0 False 1 True 2 False 3 False 4 False 5 False 6 False 7 False 8 False \t'Oh it's NEIN!'\n", "\n", "\n", "0 1.0\n", "1 1.0\n", "2 1.0\n", "3 1.0\n", "4 1.0\n", "j reached 5\n" ] } ], "source": [ "x, y, z = 3, 3.14, 10-7\n", "L1 = np.ones((10)) # 1-D numpy array of length 10, filled with 1\n", "L0 = np.zeros((10)) # 1-D numpy array of length 10, filled with 0\n", "L = np.arange(0, 10, 1) # numpy array with elements in [0, 10) with interval 1\n", "\n", "if (x != y):\n", "    print('Different!')\n", "\n", "for i in range(0,10):\n", "    if (L[i] == 9):\n", "        print(\"\\t'Oh it's NEIN!'\")\n", "        continue\n", "    print(L[i], end=' ') # option \"end\" sets the ending part for each print; test many cases by yourself!\n", "    print(L[i]==L1[i], end=' ')\n", "    \n", "print(\"\\n\")\n", "\n", "j=0\n", "while (1): # identical to while(True)\n", "    print(j, L1[j])\n", "    j += 1\n", "    if(j==5):\n", "        print(\"j reached 5\")\n", "        break\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why Numpy Array?\n", "\n", "Details: [Here](https://docs.scipy.org/doc/numpy/user/quickstart.html)\n", "\n", "You may wonder why we need the numpy array when there are the traditional Python list and tuple. The following examples will partially answer this question by showing how powerful the numpy array is compared to the usual list/tuple when we want to perform scientific calculations. 
Unlike operations on a normal Python list, NumPy's array operations are already optimized for speed, so they are much faster when there is a lot of data." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "num1:\n", "[ 0 2 4 6 8 10]\n", "num2:\n", "[ 0.78143084 0.31859395 0.94417048 0.19161944 0.19672851 0.69199476]\n", "\n", "num1^2:\n", "[ 0 4 16 36 64 100]\n", "\n", "num1^(0.5):\n", "[ 0. 1.41421356 2. 2.44948974 2.82842712 3.16227766]\n", "\n", "num1 in 2 x 3 form:\n", "[[ 0 2 4]\n", " [ 6 8 10]]\n", "\n", "num3's row 0:\n", "[0 2 4]\n", "\n", "num3's min/max:\n", "0 10\n", "(array([1]), array([2]))\n", "\n", "num1's selected elements:\n", "[2 6 8]\n", "\n", "num2's selected elements:\n", "[ 0.78143084 0.31859395 0.94417048 0.69199476]\n" ] } ], "source": [ "num1 = np.arange(0, 11, 2) # generate numpy array in [0, 11) with interval 2\n", "num2 = np.random.rand(6) # generate six random numbers in [0, 1)\n", "\n", "# Basics\n", "print(\"num1:\")\n", "print(num1)\n", "print(\"num2:\")\n", "print(num2)\n", "print(\"\\nnum1^2:\") # \"\\n\" means \"new line\".\n", "print(num1**2) # Square each element\n", "print(\"\\nnum1^(0.5):\")\n", "print(np.sqrt(num1)) # Square root of each element\n", "print(\"\\nnum1 in 2 x 3 form:\")\n", "print(num1.reshape(2,3)) # reshape the num1 into 2 by 3 form. 
\n", "# Identical to np.reshape(num1, (2,3))\n", "num3 = np.reshape(num1, (2,3))\n", "print(\"\\nnum3's row 0:\")\n", "print(num3[0,:]) # row 0, column all (:)\n", "print(\"\\nnum3's min/max:\")\n", "print(num3.min(), num3.max())\n", "# Identical to np.min(num3), np.max(num3)\n", "print(np.where(num3 == num3.max())) # Show at which position the maximum occurs\n", "\n", "# MASKING\n", "tf = np.array([False, True, False, True, True, False])\n", "print(\"\\nnum1's selected elements:\")\n", "print(num1[tf])\n", "\n", "tf = (num2*10 > 3) # Select only elements larger than 3 from num2*10\n", "# RHS is identical to just \"num2*10>3\"\n", "print(\"\\nnum2's selected elements:\")\n", "print(num2[tf])\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Complicated Calculations\n", "\n", "Some rather tedious/complicated calculations have to be done from time to time. You may want to write your own code/function and use it to carry out certain tasks effectively. I will briefly show you how to define functions." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "[1 4 9]\n" ] } ], "source": [ "def function_f(x):\n", "    return x**2\n", "\n", "a=1\n", "b=np.array([1,2,3])\n", "print(function_f(a))\n", "print(function_f(b))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also define optional arguments by giving them default values in the function definition:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "[ 1. 
1.41421356 1.73205081]\n" ] }, { "ename": "ValueError", "evalue": "mode should be either 'square' or 'sqrt'!", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m<ipython-input-9-5c3820c2f2bc>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 11\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfunction_g\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# will do square by defailt\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 12\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfunction_g\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mb\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmode\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'sqrt'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# will do sqrt\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 13\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfunction_g\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mb\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmode\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m'test'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# will raise \"ValueError\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m<ipython-input-9-5c3820c2f2bc>\u001b[0m in \u001b[0;36mfunction_g\u001b[0;34m(x, mode)\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m**\u001b[0m\u001b[0;36m0.5\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mValueError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"mode should be either 'square' or 
'sqrt'!\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 8\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0ma\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mValueError\u001b[0m: mode should be either 'square' or 'sqrt'!" ] } ], "source": [ "def function_g(x, mode='square'):\n", "    if mode=='square':\n", "        return x**2\n", "    elif mode=='sqrt':\n", "        return x**0.5\n", "    else:\n", "        raise ValueError(\"mode should be either 'square' or 'sqrt'!\")\n", "\n", "a=1\n", "b=np.array([1,2,3])\n", "print(function_g(a)) # will do square by default\n", "print(function_g(b, mode='sqrt')) # will do sqrt\n", "print(function_g(b, mode='test')) # will raise \"ValueError\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remember \"The Zen of Python\"? Errors should never pass silently!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Finding Documentation\n", "\n", "Now that you can import any package, you can use the functions it provides. Polynomial fitting, sigma clipping, ...: whatever you need has most likely already been implemented by the NumPy/SciPy/AstroPy developers. If you want to use any of these, google keywords like \"python sigma clipping\". Then you will see the numpy or scipy documentation. **BUT BE CAREFUL THAT YOU ARE LOOKING AT THE DOCUMENTATION FOR THE CORRESPONDING VERSION OF THE PACKAGE!!!** If you are reading the Astropy version 0.2 documentation while you are using v1.3, for instance, the functions and arguments it describes may differ from those you actually have installed.\n", "\n", "Most of the documentation contains some core examples to help you understand things correctly, and you can see the source code if you want (there is usually a '[source]' link, which redirects you to the GitHub repository).\n", "\n", "Sometimes you may not be able to find comprehensive online documentation. Then use the **`docstring`**. 
For instance, if the function or module has the name `package.sub`, then it is most likely that it has its own docstring, and you can print it by using `__doc__`:\n", "\n", "```python\n", "from package import sub\n", "print(sub.__doc__)\n", "\n", "```" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 2 }