{ "cells": [ { "cell_type": "markdown", "id": "59db57ac-d174-4c88-b395-2204273a30f1", "metadata": {}, "source": [ "# Using Python for learning statistics Part 1" ] }, { "cell_type": "markdown", "id": "282bf6aa-0555-4852-9a6b-1b58eb935b1c", "metadata": {}, "source": [ "This Jupyter notebook contains the code examples from the blog post [Python coding skills for statistics Part 1](https://docs.google.com/document/d/16WJnYeezBevUBvsYpbklW04ukvrtG_8QVdfsbd15pqg/edit).\n", "\n", "I've intentionally left empty code cells throughout the notebook,\n", "which you can use to try some Python commands on your own.\n", "For example,\n", "you can copy-paste some of the commands from previous cells,\n", "modify them, and run them to see what happens.\n", "Try to break things, that's the best way to learn!\n", "\n", "**To run a code cell, press** the play button in the menu bar, or use the keyboard shortcut **SHIFT+ENTER**." ] }, { "cell_type": "code", "execution_count": null, "id": "ed5832a3-6014-47df-96f9-f0d969c94077", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "3f689330-7def-4935-be79-520d1c0190b6", "metadata": {}, "source": [ "## What can Python do for you?" 
] }, { "cell_type": "markdown", "id": "7bf445c3-2d7c-48ac-90be-84417d51f685", "metadata": {}, "source": [ "### Using Python as a calculator" ] }, { "cell_type": "code", "execution_count": 1, "id": "6ff440f2-f2c4-4821-8af0-1bec72c2b608", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "5.5" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2.1 + 3.4" ] }, { "cell_type": "code", "execution_count": 2, "id": "a5d156a0-ac2f-4271-95e6-1f22f499b535", "metadata": {}, "outputs": [], "source": [ "num1 = 2.1" ] }, { "cell_type": "code", "execution_count": 3, "id": "1d87588a-e92e-4225-9ace-dbe533c4f4f6", "metadata": {}, "outputs": [], "source": [ "num2 = 3.4" ] }, { "cell_type": "code", "execution_count": 4, "id": "a7eaa89b-cf12-477c-94b6-30641373ea55", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5.5" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "num1 + num2" ] }, { "cell_type": "code", "execution_count": null, "id": "27790060-6a22-4f9f-93e4-b85ff7273287", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "678eddd7-8f7e-483b-ab28-eb28db48c6bd", "metadata": {}, "source": [ "Let's now compute the average of the numbers num1 and num2." 
] }, { "cell_type": "code", "execution_count": 5, "id": "305c829d-680e-4652-8399-2f55b40bee57", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "2.75" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(num1 + num2) / 2" ] }, { "cell_type": "code", "execution_count": null, "id": "08222c31-5369-4801-b65b-4733726f534b", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "d84a4c83-e628-4d28-a559-883c3af78a13", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "a63a1efe-8635-45a9-8beb-9c9d05fad193", "metadata": {}, "source": [ "### Powerful primitives and builtin functions" ] }, { "cell_type": "code", "execution_count": 6, "id": "57b9fab6-d2bd-4ebf-a191-eb83f81d6a0b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "75.0" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grades = [80, 90, 70, 60]\n", "avg = sum(grades) / len(grades)\n", "avg" ] }, { "cell_type": "code", "execution_count": null, "id": "1e3e7128-4902-4ee8-b83b-0e2bbfc31764", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "719d9567-6585-4794-89ba-e314cc8e148e", "metadata": {}, "source": [ "### For loops" ] }, { "cell_type": "code", "execution_count": 7, "id": "e097b267-e125-4f0f-bb06-5e6ef837fb48", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "75.0" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "total = 0\n", "for grade in grades:\n", "    total = total + grade\n", "avg = total / len(grades)\n", "avg" ] }, { "cell_type": "code", "execution_count": null, "id": "31dca516-7c74-42a8-9b93-44098a0249df", "metadata": { "editable": true, "slideshow": { 
"slide_type": "" }, "tags": [] }, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "960b0808-0fff-47eb-9d7e-9069d3074c14", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "### Functions" ] }, { "cell_type": "markdown", "id": "92fff3a7-8337-4f7a-945d-80b93292f18e", "metadata": {}, "source": [ "Python functions are ..." ] }, { "cell_type": "markdown", "id": "31e5063f-59da-423e-a7ba-60f9e1906af3", "metadata": {}, "source": [ "To **define** the Python function,\n", "we use the def keyword followed by the function name,\n", "then we then specify the function input in parentheses,\n", "and end with the symbol :,\n", "which tells us \"body\" of the function is about to start.\n", "The function body is a four-spaces-indented code block that specifies all the\n", "calculations the function performs,\n", "and ends with a return statement for the output of the function.\n", "\n", "\n", "def ():\n", " \n", " \n", " \n", " return \n", "" ] }, { "cell_type": "markdown", "id": "f5142273-ad49-459a-8b55-32ffccee1db1", "metadata": {}, "source": [ "#### Example 1: sample mean\n", "\n", "We want to define a Python function mean that computes the mean from a given sample (a list of values).\n", "\n", "The mathematical definition of the mean is $\\mathbf{Mean}(\\mathbf{x}) = \\frac{1}{n} \\sum_{i=1}^{i=n} x_i$,\n", "where $\\mathbf{x} = [x_1, x_2, x_3, \\ldots, x_n]$ is a sample of size $n$ (a list of values).\n", "\n", "The code for the function is as follows:" ] }, { "cell_type": "code", "execution_count": 8, "id": "ab6e7d58-79f9-4afd-9f85-b814b45eb1bb", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "def mean(values):\n", " total = 0\n", " for value in values:\n", " total = total + value\n", " avg = total / len(values)\n", " return avg" ] }, { "cell_type": "markdown", "id": "346a10de-ad20-425a-b65b-c471874a7d3e", "metadata": { "editable": true, "slideshow": { 
"slide_type": "" }, "tags": [] }, "source": [ "To **call** the function mean with input grades, we use the Python code mean(grades)." ] }, { "cell_type": "code", "execution_count": 9, "id": "ab56d567-0758-4b70-9b97-fb2cd1283deb", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "75.0" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grades = [80, 90, 70, 60]\n", "mean(grades)" ] }, { "cell_type": "code", "execution_count": null, "id": "bf781976-47b4-4a89-9f99-fdda0c48dd8a", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "795538a1-fa68-43c8-b8ce-461d67d57842", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### Exmample 2: math function (bonus topic)\n", "\n", "In math, \n", "a function is a mapping from input values (usually denoted x) to output values (usually denoted y).\n", "Consider the mapping that doubles the input and adds five to it,\n", "which we can express as the math function $f(x) = 2x+5$.\n", "For any input $x$,\n", "the output of the function $f$ is denoted $f(x)$ and is equal to $2x+5$.\n", "For example, $f(3)$ describes the output of the function when the input is $x=3$,\n", "and it is equal to $2(3)+5 = 6 + 5 = 11$.\n", "The Python equivalent of the math function $f(x) = 2x+5$ is shown below. 
" ] }, { "cell_type": "code", "execution_count": 10, "id": "00fd3024-4329-4899-af02-2023a5e722f1", "metadata": {}, "outputs": [], "source": [ "def f(x):\n", " y = 2*x + 5\n", " return y" ] }, { "cell_type": "markdown", "id": "87abe9ef-97b1-4744-816b-7619539eff4e", "metadata": {}, "source": [ "To **call** the function f with input x, we simply writhe f(x) in Python,\n", "which is the same as the math notation we use for \"evaluate the function at the value x.\"" ] }, { "cell_type": "code", "execution_count": 11, "id": "88d9656c-913f-4617-999b-d98460ec070d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "11" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f(3)" ] }, { "cell_type": "code", "execution_count": null, "id": "5c4a84ba-e05d-4714-a525-8a2640147321", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "2483aa87-3da4-4b21-9f1c-be9899ba851f", "metadata": {}, "source": [ "## Why do you need coding for statistics?" 
] }, { "cell_type": "markdown", "id": "3fdcf50f-49c5-4368-a924-ce53289cf298", "metadata": {}, "source": [ "### Data visualization" ] }, { "cell_type": "code", "execution_count": 12, "id": "a70a4bc4-50a7-4b50-8706-b2295bf57eb4", "metadata": {}, "outputs": [], "source": [ "prices = [11.8, 10, 11, 8.6, 8.3, 9.4, 8, 6.8, 8.5]" ] }, { "cell_type": "code", "execution_count": 13, "id": "5891ee22-6a6c-416d-a1bb-b283bb6a6519", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAg0AAAGdCAYAAACRlkBKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAPjUlEQVR4nO3dTYidhb3H8f9JxElIZ1IMdF7qxCbR1iKieBdF6aLFEBGxLaWtihdjQ+FCs9CNaJHUhRWJShfWt40EiS/oIrpwE9SmLYKo3KhIkZiUkIamyUaTMyb40sxzF70Zmirjb8bjOTNPPx84C5/hPM///M/B+ebkTKbTNE1TAACfY8mgBwAAFgfRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQOSM+d5xenq6Dh06VMPDw9XpdHo5EwDwJWmapqampmpiYqKWLJnbewfzjoZDhw7V5OTkfO8OAAzQwYMH6+yzz57TfeYdDcPDwzMXHRkZme9pAIA+6na7NTk5OfN9fC7mHQ2n/kpiZGRENADAIjOfjxb4ICQAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARD
QBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBARDQAABHRAABERAMAEBENAEBENAAAEdEAAEREAwAQEQ0AQEQ0AAAR0QAAREQDABARDQBA5IxBD/Cvdv75cD20a1+9e+SD+uboV+qX3z+3rrhgbNBjMYt+Pme9ulZyHq/F2bV1P219XAuNPc9uIe+n0zRNM587drvdWrlyZR07dqxGRka+8CA7/3y4/mf7/54+XKfqkf/+rwWzLE7Xz+esV9dKzuO1OLu27qetj2uhsefZ9WM/X+T7d/zXEx999FF1u93Tbr300K59nzrWNFUP/eEvPb0OvdPP56xX10rO47U4u7bup62Pa6Gx59kt9P3E0XD33XfXypUrZ26Tk5M9HeTdIx985vG9R6Z6eh16p5/PWa+ulZzHa3F2bd1PWx/XQmPPs1vo+4mj4Ve/+lUdO3Zs5nbw4MGeDvLN0a985vHzRod7eh16p5/PWa+ulZzHa3F2bd1PWx/XQmPPs1vo+4mjYWhoqEZGRk679dIvv39udTqnH+t0qjZ/b11Pr0Pv9PM569W1kvN4Lc6urftp6+NaaOx5dgt9Pwvmg5BV//+J0T/8pfYemarzRodr8/fW1QYfjFnQ+vmc9epayXm8FmfX1v209XEtNPY8uy97P1/k+/eCigYA4MvVl5+eAAD+s4kGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACKiAQCIiAYAICIaAICIaAAAIqIBAIiIBgAgIhoAgMgZ871j0zRVVdXtdns2DADw5Tr1ffvU9/G5mHc0TE1NVVXV5OTkfE8BAAzI1NRUrVy5ck736TTzSY2qmp6erkOHDtXw8HB1Op35nOJL1e12a3Jysg4ePFgjIyODHqfV7Lo/7Lk/7Lk/7Lk/PmvPTdPU1NRUTUxM1JIlc/uUwrzfaViyZEmdffbZ871734yMjHhB9old94c994c994c998e/73mu7zCc4oOQAEBENAAAkdZGw9DQUN1xxx01NDQ06FFaz677w577w577w577o9d7n
vcHIQGA/yytfacBAOgt0QAAREQDABARDQBApHXR8I1vfKM6nc6nbps3bx70aK1y8uTJ2rJlS61Zs6aWL19e69atqzvvvHNe/5Y5s5uamqqbb765zjnnnFq+fHlddtll9frrrw96rEXvT3/6U1199dU1MTFRnU6nnnvuudO+3jRN/frXv67x8fFavnx5rV+/vvbu3TuYYRexz9vzjh07asOGDbVq1arqdDr15ptvDmTOxW62PX/yySd166231oUXXlgrVqyoiYmJuuGGG+rQoUNzvk7rouH111+vv//97zO3F154oaqqfvrTnw54snbZunVrPfzww/XAAw/UO++8U1u3bq177rmnfve73w16tNb5xS9+US+88EJt37693n777dqwYUOtX7++/va3vw16tEXt+PHjddFFF9WDDz74mV+/55576v77769HHnmkXn311VqxYkVdccUV9eGHH/Z50sXt8/Z8/Pjx+u53v1tbt27t82TtMtueT5w4Ubt3764tW7bU7t27a8eOHbVnz576wQ9+MPcLNS130003NevWrWump6cHPUqrXHXVVc2mTZtOO/bjH/+4uf766wc0UTudOHGiWbp0afP888+fdvySSy5pbr/99gFN1T5V1Tz77LMz/z09Pd2MjY01995778yxo0ePNkNDQ81TTz01gAnb4d/3/K/279/fVFXzxhtv9HWmNpptz6e89tprTVU1Bw4cmNO5W/dOw7/6+OOP6/HHH69NmzYtyF+qtZhddtll9dJLL9W7775bVVVvvfVWvfzyy3XllVcOeLJ2+cc//lEnT56sZcuWnXZ8+fLl9fLLLw9oqvbbv39/HT58uNavXz9zbOXKlfWd73ynXnnllQFOBr1x7Nix6nQ69dWvfnVO95v3L6xaDJ577rk6evRo3XjjjYMepXVuu+226na7df7559fSpUvr5MmTddddd9X1118/6NFaZXh4uC699NK6884769vf/naNjo7WU089Va+88kqde+65gx6vtQ4fPlxVVaOjo6cdHx0dnfkaLFYffvhh3XrrrXXdddfN+ZeFtfqdhkcffbSuvPLKmpiYGPQorfPMM8/UE088UU8++WTt3r27HnvssbrvvvvqscceG/RorbN9+/Zqmqa+/vWv19DQUN1///113XXXzflX2gJ88skn9bOf/ayapqmHH354zvdv7TsNBw4cqBdffLF27Ngx6FFa6ZZbbqnbbrutrr322qqquvDCC+vAgQN1991318aNGwc8XbusW7eu/vjHP9bx48er2+3W+Ph4XXPNNbV27dpBj9ZaY2NjVVV15MiRGh8fnzl+5MiRuvjiiwc0FXwxp4LhwIED9fvf/35ev5K8tX9U2bZtW33ta1+rq666atCjtNKJEyc+9SfdpUuX1vT09IAmar8VK1bU+Ph4vf/++7Vz58764Q9/OOiRWmvNmjU1NjZWL7300syxbrdbr776al166aUDnAzm51Qw7N27t1588cVatWrVvM7Tyncapqena9u2bbVx48Y644xWPsSBu/rqq+uuu+6q1atX1wUXXFBvvPFG/fa3v61NmzYNerTW2blzZzVNU9/61rdq3759dcstt9T5559fP//5zwc92qL2wQcf1L59+2b+e//+/fXmm2/WWWedVatXr66bb765fvOb39R5551Xa9asqS1bttTExET96Ec/GtzQi9Dn7fm9996rv/71rzP/ZsCePXuq6p/v9px6x4fPN9uex8fH6yc/+Unt3r27nn/++Tp58uTMZ3POOuusOvPMM/MLzfdHOhaynTt3NlXV7NmzZ9CjtFa3221uuummZvXq1c2yZcuatWvXNrfffnvz0UcfDXq01nn66aebtWvXNmeeeWYzNjbWbN68uTl69Oigx1r0du3a1VTVp24bN25smuafP3a5ZcuWZnR0tBkaGmouv/xy/0+Zh8/b87Zt2z7z63fcccdA515sZtvzqR9n/azbrl275nQdvxobAIi09jMNAEBvi
QYAICIaAICIaAAAIqIBAIiIBgAgIhoAgIhoAAAiogEAiIgGACAiGgCAiGgAACL/B/y6ZpnmPzipAAAAAElFTkSuQmCC\n", "text/plain": [ "