{ "cells": [ { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "# Introduction to \"Python\" and \"IPython\" using \"Jupyter\"" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "**Python** is a powerful and easy-to-use programming language. It has a large community of developers and, given its open-source nature, you can find many solutions, scripts, and help all over the web. It is easy to learn and code, and with its numerical libraries it is fast enough for most research computing...and did I mention it is _free_ because it is **open-source**?\n", "\n", "**IPython** is a very powerful extension to Python that provides:\n", "\n", "* Powerful interactive shells (terminal, Qt-based, and notebooks based on [Jupyter](http://jupyter.org/)).\n", "* A browser-based notebook with support for code, text, mathematical expressions, inline plots and other rich media.\n", "* Support for interactive data visualization and use of GUI toolkits.\n", "* Flexible, embeddable interpreters to load into your own projects.\n", "* Easy-to-use, high-performance tools for parallel computing.\n", "\n", "**Jupyter** is an open-source project that provides open standards and services for interactive computing across dozens of programming languages, including ``Python``, ``R``, ``Stata`` and many others used by economists." ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "# Getting Python, IPython, R, and Jupyter" ] }, { "cell_type": "markdown", "metadata": { "tags": [], "user_expressions": [] }, "source": [ "You can download and install ``Python`` and its packages for free for your computer from [Python.org](http://www.python.org). 
While this is the official site, which offers the basic installer, and you can try to add any packages you require yourself, a much easier and almost foolproof approach is to use [Continuum Anaconda](https://www.continuum.io/downloads) or [Enthought Canopy](https://www.enthought.com/products/canopy). Both of these distributions offer academic licenses ([Canopy](https://www.enthought.com/products/canopy/academic/)), which allow you to use a larger set of packages. Similarly, you can download ``R`` from the [r-project](https://www.r-project.org/) website. \n", "\n", "I personally have switched to using [Continuum Anaconda](https://www.continuum.io/downloads) since it makes installing all the packages and software I use much easier. You can follow the instructions below or, better yet, follow the instructions on the [Computation Page](https://econgrowth.github.io/pages/Computation.html) of my [Economic Growth and Comparative Development Course](https://econgrowth.github.io/).\n", "\n", "\n", "## Installing (I)Python & Jupyter\n", "The easiest and most convenient way to install a working version of IPython with all the required packages and tools is using [Continuum's Anaconda Distribution](https://www.anaconda.com/distribution/). You can install it by following the instructions on that website, or you can just run [this script (Mac/Linux)](https://www.dropbox.com/s/6st528ethbkmvv2/CondaInstall.sh?dl=0). After installing the latest version of Anaconda, add the ``Anaconda/bin`` directory to your ``PATH`` variable. 
\n", "\n", "To create an environment useful for these notebooks, in your terminal execute\n", "\n", "```bash\n", "conda create --name GeoPython3env -c conda-forge -c r -c mro --override-channels python=3.9 georasters geopandas pandas spatialpandas statsmodels xlrd networkx ipykernel ipyparallel ipython ipython_genutils ipywidgets jupyter jupyterlab kiwisolver matplotlib-base matplotlib scikit-image scikit-learn scipy seaborn geoplot geopy geotiff pycountry nb_conda_kernels stata_kernel nltk\n", "```\n", "\n", "This should create an environment with most of the packages we need. We can always install others down the road.\n", "\n", "To start using the environment, you will need to execute the following command\n", "\n", "```bash\n", "source activate GeoPython3env\n", "```\n" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "#### Note\n", "\n", "I assume you have followed the steps above and have installed ``Anaconda``. Everything we do should work on any distribution that has the required packages, since the Python scripts should (in principle) run on any of them. \n", "\n", "We will use IPython as our computing environment." ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "## Let's get started" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "Once you have your Python distribution installed you'll be ready to start working. 
You have various options:\n", "\n", "* Open the ``Canopy`` program and work there\n", "* Open ``Anaconda Navigator`` and open one of the apps from there (`python`, `ipython`, `jupyter console`, `jupyter notebook`, `jupyter lab`, ``R``, `Stata`)\n", "* From the Terminal prompt (command-line in Windows) execute one of the following commands:\n", "    - `ipython`\n", "    - `jupyter console`\n", "    - `jupyter qtconsole`\n", "    - `jupyter notebook`\n", "    - `jupyter lab`\n", "\n", "While these last ones all use IPython, each has its advantages and disadvantages. You should play with them to get a better feeling of which you want to use for which purpose. In my own research I usually use a text editor ([TextMate](http://macromates.com/), [Atom](https://atom.io/), [Sublime](http://www.sublimetext.com/)) and the `jupyter qtconsole` or the ``jupyter notebook``. To see the power of ``Jupyter notebooks``, see this excellent and in-depth [presentation](http://youtu.be/xe_ATRmw0KM) by its creators. As you will see, this might prove an excellent environment to do research, homework, replicate papers, etc." ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "#### Note\n", "\n", "You can pass some additional options to `ipython` in order to change colors and rendering of plots. I usually use `jupyter qtconsole --color=linux --pylab=inline`. You can create profiles to manage many options within `IPython` and `Jupyter`." ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "# First steps" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "Let's start by running some simple commands at the prompt to do some simple computations." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1+1-2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "3*2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "3**2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "-1**2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "3*(3-2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "3*3-2" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "Notice that Python follows the usual order of operations, so exponentiation comes before multiplication and division, etc. In particular, `-1**2` evaluates to `-1`, because exponentiation binds more tightly than the unary minus." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1/2" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "If you are using `Python 2.7`, you will notice that this answer (`0`) is wrong if $1,2\\in\\mathbb{R}$: Python 2 treats them as integers, so it forces an integer result. In `Python 3`, `1/2` already returns `0.5`. To get this more natural behavior of division in Python 2 we need" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from __future__ import division\n", "1/2" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "#### Note\n", "It is a good idea to include this among the packages to be imported by default if you work in Python 2." ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "### Getting help" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "So what else can we do? Where do we start if we are new? You can use `?` or `help()` to get help." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "?" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "help()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you want information about a command, say `mycommand`, you can use `help(mycommand)`, `mycommand?` or `mycommand??` to get information about how it is used or even see its code." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "help(sum)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sum?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sum??" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "#### Variables, strings, and other objects" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "We can print information" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Hello World!')" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "We can also create variables, which can be of various types" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = 1\n", "b = 2\n", "a+b" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "`a` and `b` now hold numerical values we can use for computing" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c = [1, 2]\n", "d = [[1, 2], [3, 4]]\n", "print('c=%s' % c)\n", "print('d=%s' % d)" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "Notice that we have used the placeholder `%s` and the operator `%` to insert the string representation of each variable into the text passed to the print function.\n", "\n", "What kind of variables are `c` and `d`? They look like vectors and matrices, but..." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(' a * c = %s' % (a * c))\n", "print(' b * d = %s' % (b * d))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c*d" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Actually, Python does not have vectors or matrices directly available. Instead it has lists, sets, arrays, etc., each with its own set of operations. We defined `c` and `d` as list objects, so multiplying a list by an integer repeats it, and multiplying two lists raises a `TypeError`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(c)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(d)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(a)" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "Luckily, Python has a powerful package for numerical computing called [Numpy](http://www.numpy.org/)." ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "### Extending Python's Functionality with Packages" ] }, { "cell_type": "markdown", "metadata": { "tags": [], "user_expressions": [] }, "source": [ "In order to use a package in Python or IPython, say `mypackage`, you need to import it by executing \n", " \n", "    import mypackage\n", "\n", "After executing this command, you will have access to the functions and objects defined in `mypackage`. For example, if `mypackage` has a function `squared` that takes a real number `x` and computes its square, we can use this function by calling `mypackage.squared(x)`. 
Since the name of some packages might be too long, you can give them a nickname by importing them instead as\n", " \n", "    import mypackage as myp\n", "\n", "so now we could compute the square of `x` by calling `myp.squared(x)`.\n", "\n", "We will see various packages that will be useful to do computations, statistics, plots, etc.\n", "\n", "IPython has a command that imports Numpy and Matplotlib (Python's main plotting package). Numpy is imported as `np` and Matplotlib's `pyplot` module as `plt`. One could import these by hand by executing\n", "\n", "    import numpy as np\n", "    import matplotlib.pyplot as plt\n", " \n", "but the creators of IPython have optimized the interaction between these packages, which you can set up by running the following command:\n", "\n", "    %pylab" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pylab?" ] }, { "cell_type": "markdown", "metadata": { "tags": [], "user_expressions": [] }, "source": [ "I do recommend using the `--no-import-all` option in order to ensure you do not contaminate the namespace. So it might be best to use\n", " \n", "    %pylab --no-import-all\n", "    %matplotlib" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pylab --no-import-all\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np?" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "Let us now recreate `c` and `d`, but as Numpy arrays instead." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ca = np.array(c)\n", "da = np.array(d)\n", "print('c = %s' % c)\n", "print('d = %s' % d)\n", "print('ca = %s' % ca)\n", "print('da = %s' % da)" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "We could have created them as matrices instead. Again, how you want to create them depends on what you will be doing with them. See the Numpy documentation for an explanation of the differences between Numpy arrays and matrices; note that it no longer recommends `np.matrix` for new code." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cm = np.matrix(c)\n", "dm = np.matrix(d)\n", "print('cm = %s' % cm)\n", "print('dm = %s' % dm)" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "Let's see some information about these...(this is a good moment to show tab completion...a _wonderful_ feature of IPython, which the plain Python shell does not offer to the same extent)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cm.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ca.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dm.diagonal()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da.cumsum()" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "Let's try some operations again on our new arrays and matrices" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cm*dm" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ca" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ca*da" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": {}, "outputs": [], "source": [ "ca.dot(da)" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "We can create special matrices using Numpy's functions and classes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(np.ones((3,4)))\n", "print(np.zeros((2,2)))\n", "print(np.eye(2))\n", "print(np.ones_like(cm))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.uniform(-1,1,10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Uncomment the next line to make the simulation reproducible\n", "#np.random.seed(123456)\n", "x0 = 0\n", "x = [x0]\n", "# Each period adds a standard normal shock to the previous value\n", "for i in range(500):\n", "    x.append(x[-1] + np.random.normal())\n", "plt.plot(x)\n", "plt.title('A simple random walk')\n", "plt.xlabel('Period')\n", "plt.ylabel('Log Income')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "### Extending Capabilities with Functions" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "We have used some of the functions in Python, Numpy and Matplotlib. But what if we wanted to create our own functions? It is very easy to do so in Python. There are two ways to define functions. Let's use them to define the CRRA utility function $u(c)=\\frac{c^{1-\\sigma}-1}{1-\\sigma}$ and the production function $f(k)=Ak^\\alpha$.\n", "\n", "The first method is as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def u(c, sigma):\n", "    '''This function returns the value of utility when the CRRA\n", "    coefficient is sigma. I.e. 
\n", "    u(c,sigma)=(c**(1-sigma)-1)/(1-sigma) if sigma!=1 \n", "    and \n", "    u(c,sigma)=ln(c) if sigma==1\n", "    Usage: u(c,sigma)\n", "    '''\n", "    if sigma!=1:\n", "        u = (c**(1-sigma) - 1) / (1-sigma)\n", "    else:\n", "        u = np.log(c)\n", "    return u" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "This defines the utility function. Let's plot it for $0< c\\le5$ and $\\sigma\\in\\{0.5,1,1.5\\}$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create vector\n", "c = np.linspace(0.1, 5, 100)\n", "# Evaluate utilities for different CRRA parameters\n", "u1 = u(c, .5)\n", "u2 = u(c, 1)\n", "u3 = u(c, 1.5)\n", "# Plot\n", "plt.plot(c, u1, label=r'$\\sigma=.5$')\n", "plt.plot(c, u2, label=r'$\\sigma=1$')\n", "plt.plot(c, u3, label=r'$\\sigma=1.5$')\n", "plt.xlabel(r'$c_t$')\n", "plt.ylabel(r'$u(c_t)$')\n", "plt.title('CRRA Utility function')\n", "plt.legend(loc=4)\n", "plt.savefig('./CRRA.jpg', dpi=150)\n", "plt.savefig('./CRRA.pdf', dpi=150)\n", "plt.savefig('./CRRA.png', dpi=150)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "While this is nice, it requires us to always pass a value for the CRRA coefficient. Furthermore, we need to remember whether $c$ is the first or second argument. Since we tend to use log-utility a lot, let us change the definition of the utility function so that $\\sigma$ has a default value of 1 and can be passed by name when we want a different value." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def u(c, sigma=1):\n", "    '''This function returns the value of utility when the CRRA\n", "    coefficient is sigma. I.e. 
\n", "    u(c,sigma)=(c**(1-sigma)-1)/(1-sigma) if sigma!=1 \n", "    and \n", "    u(c,sigma)=ln(c) if sigma==1\n", "    Usage: u(c,sigma=value), where sigma=1 is the default \n", "    '''\n", "    if sigma!=1:\n", "        u = (c**(1-sigma) - 1) / (1-sigma)\n", "    else:\n", "        u = np.log(c)\n", "    return u" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sigma1 = .25\n", "sigma3 = 1.25\n", "u1 = u(c, sigma=sigma1)\n", "u2 = u(c)\n", "u3 = u(c, sigma=sigma3)\n", "plt.plot(c, u1, label=r'$\\sigma='+str(sigma1)+'$')\n", "plt.plot(c, u2, label=r'$\\sigma=1$')\n", "plt.plot(c, u3, label=r'$\\sigma='+str(sigma3)+'$')\n", "plt.xlabel(r'$c_t$')\n", "plt.ylabel(r'$u(c_t)$')\n", "plt.title('CRRA Utility function')\n", "plt.legend(loc=4)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "### Exercise\n", "\n", "Write the function for the Cobb-Douglas production function. Can you generalize it so that we can use it for aggregate, per capita, and per efficiency units without having to write a function for each?\n", "\n", "Remember aggregate production is \n", "$$\n", "Y = F(K, AL) = K^\\alpha (A L)^{1-\\alpha},\n", "$$\n", "\n", "per capita is\n", "\n", "$$\n", "\\hat y=\\frac{F(K, AL)}{L} = \\frac{K^\\alpha (A L)^{1-\\alpha}}{L} = Ak^\\alpha,\n", "$$\n", "\n", "and per effective worker\n", "\n", "$$\n", "y = \\frac{F(K, AL)}{AL} = \\frac{K^\\alpha (A L)^{1-\\alpha}}{AL} = k^\\alpha,\n", "$$\n", "\n", "where $k=K/(AL)$." ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "The second method is to use the `lambda` notation, which allows you to define functions in one line or without giving the function a name." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "squared = lambda x: x**2\n", "squared(2)" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "# Our first script" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "Let's write a script that prints \"Hello World!\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%file?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%file helloworld.py\n", "#!/usr/bin/env python\n", "# coding=utf-8\n", "'''\n", "My First script in Python\n", "\n", "Author: Me\n", "E-mail: me@me.com\n", "Website: http://me.com\n", "GitHub: https://github.com/me\n", "Date: Today\n", "\n", "This code computes Random Walks and graphs them\n", "'''\n", "\n", "'''\n", "from __future__ import division\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "'''\n", "print('Hello World!')\n" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "Let's run that script" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%run helloworld.py" ] }, { "cell_type": "markdown", "metadata": { "user_expressions": [] }, "source": [ "#### Exercise\n", "\n", "Write a simple script ``randomwalk.py`` that simulates and plots random walks. 
In particular, create a function `randomwalk(x0, T, mu, sigma)` that simulates the random walk starting at $x_0=x0$ until $t=T$, where the shock is distributed $\\mathcal{N}(\\mu,\\sigma^2)$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import randomwalk as rw" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rw.randomwalk(0, 500, 0, 1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from randomwalk import randomwalk" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "randomwalk??" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "time.sleep(10)\n", "print(\"It's time\")" ] }, { "cell_type": "markdown", "metadata": { "tags": [], "user_expressions": [] }, "source": [ "Notebook written by [Ömer Özak](http://omerozak.com) for his Ph.D. students in Economics at [Southern Methodist University](http://www.smu.edu). Feel free to use, distribute, or contribute." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }