{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Chapter 4 - Under the Hood - Training a Digit Classifier\n", "> Deep Learning for Coders with fastai & PyTorch - Under the Hood - Training a Digit Classifier. In this notebook, I add some cells on utility functions: `path`, `ls`, `untar_data`, `!`, and `tree` usage. As usual, I followed both Jeremy Howard's lesson and the Weights & Biases reading group videos. Click the `open in colab` button on the right side to view it as a notebook.\n", "- toc: true \n", "- badges: true\n", "- comments: true\n", "- categories: [fastbook]\n", "- image: images/magpie.png" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](images/chapter-04/magpie.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> I found this little one in front of my window, suffering from a foot deformity and unable to fly. Now fully recovered and back with his/her family." ] }, { "cell_type": "code", "execution_count": 259, "metadata": {}, "outputs": [], "source": [ "#!pip install -Uqq fastbook\n", "import fastbook\n", "fastbook.setup_book()\n", "# The line below disables Jedi autocomplete, which doesn't work well.\n", "%config Completer.use_jedi = False\n" ] }, { "cell_type": "code", "execution_count": 260, "metadata": {}, "outputs": [], "source": [ "from fastai.vision.all import *\n", "from fastbook import *\n", "\n", "matplotlib.rc('image', cmap='Greys')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## EXPLORING THE DATASET" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What does untar_data do?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: 'untar_data' comes from the fastai library. It downloads the data and extracts it if that hasn't been done already, then returns the destination folder." 
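, "

As a rough illustration of the 'extract only if needed, then return the folder' behaviour described in the note above, here is a minimal sketch that uses only the Python standard library. `extract_once` and the toy archive are made-up names for this example. This is not how fastai implements `untar_data`, which also takes care of the download.

```python
import tarfile
import tempfile
from pathlib import Path


def extract_once(archive, dest_root):
    # Hypothetical helper, NOT fastai's untar_data: it only mirrors the
    # 'extract if needed, then return the folder' idea from the note.
    # Assumption: the archive holds a single top-level folder named after
    # the archive itself, e.g. toy_data.tgz -> toy_data/.
    dest = Path(dest_root) / Path(archive).name.split('.')[0]
    if not dest.exists():  # skip the work when the data is already there
        with tarfile.open(archive) as tar:
            tar.extractall(dest_root)
    return dest


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Build a tiny archive that stands in for a downloaded dataset.
    src = tmp / 'src' / 'toy_data'
    src.mkdir(parents=True)
    (src / 'labels.csv').write_text('name,label')
    archive = tmp / 'toy_data.tgz'
    with tarfile.open(archive, 'w:gz') as tar:
        tar.add(src, arcname='toy_data')

    cache = tmp / 'cache'
    cache.mkdir()
    first = extract_once(archive, cache)   # extracts the archive
    second = extract_once(archive, cache)  # finds the folder, extracts nothing
    extracted = sorted(p.name for p in first.iterdir())

print(first == second, extracted)
```

The second call returns the same folder without reading the archive again, which is presumably why re-running the `untar_data` cell is cheap once the data is on disk.
"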
] }, { "cell_type": "code", "execution_count": 261, "metadata": {}, "outputs": [], "source": [ "path = untar_data(URLs.MNIST_SAMPLE)" ] }, { "cell_type": "code", "execution_count": 262, "metadata": {}, "outputs": [], "source": [ "??untar_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Tip: Check its source with '??'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What is path?" ] }, { "cell_type": "code", "execution_count": 263, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Path('.')" ] }, "execution_count": 263, "metadata": {}, "output_type": "execute_result" } ], "source": [ "path" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Tip: What is inside the current folder? This is where the Jupyter notebook runs. A '!' at the beginning means the command runs in the terminal." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What is !ls?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: __ls runs in the terminal. 
(-d lists the directory itself instead of its contents)__" ] }, { "cell_type": "code", "execution_count": 264, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2020-02-20-test.ipynb\t ghtop_images my_icons\r\n", "2021-07-16-chapter-4.ipynb images\t README.md\r\n" ] } ], "source": [ "!ls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__It can be used like this too:__" ] }, { "cell_type": "code", "execution_count": 265, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/niyazi/.fastai/data/mnist_sample/train\r\n" ] } ], "source": [ "!ls /home/niyazi/.fastai/data/mnist_sample/train -d" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__And also like this:__" ] }, { "cell_type": "code", "execution_count": 266, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/niyazi/.fastai/data/mnist_sample/train/3\r\n" ] } ], "source": [ "!ls /home/niyazi/.fastai/data/mnist_sample/train/3 -d" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What is tree?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: __tree shows the tree structure of files and folders (the -d argument lists directories only)__" ] }, { "cell_type": "code", "execution_count": 267, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/niyazi/.fastai/data/mnist_sample/\r\n", "├── train\r\n", "│   ├── 3\r\n", "│   └── 7\r\n", "└── valid\r\n", "    ├── 3\r\n", "    └── 7\r\n", "\r\n", "6 directories\r\n" ] } ], "source": [ "!tree /home/niyazi/.fastai/data/mnist_sample/ -d" ] }, { "cell_type": "code", "execution_count": 268, "metadata": {}, "outputs": [], "source": [ "Path.BASE_PATH = path" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What is ls()?" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: 'ls' is a method added by fastai. It returns an 'L', a fastai class that is similar to Python's built-in list but more powerful." ] }, { "cell_type": "code", "execution_count": 269, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(#3) [Path('labels.csv'),Path('train'),Path('valid')]" ] }, "execution_count": 269, "metadata": {}, "output_type": "execute_result" } ], "source": [ "path.ls()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: The '/' operator joins path components:" ] }, { "cell_type": "code", "execution_count": 270, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Path('train')" ] }, "execution_count": 270, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(path/'train')" ] }, { "cell_type": "code", "execution_count": 271, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(#2) [Path('train/7'),Path('train/3')]" ] }, "execution_count": 271, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(path/'train').ls()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: There are two folders under the training folder, one per digit." ] }, { "cell_type": "code", "execution_count": 272, "metadata": {}, "outputs": [], "source": [ "threes = (path/'train'/'3').ls().sorted()\n", "sevens = (path/'train'/'7').ls().sorted()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: This code returns a sorted list of paths for each digit." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What is PIL? 
(Python Imaging Library)" ] }, { "cell_type": "code", "execution_count": 273, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PIL.PngImagePlugin.PngImageFile" ] }, "execution_count": 273, "metadata": {}, "output_type": "execute_result" } ], "source": [ "im3_path = threes[1]\n", "im3 = Image.open(im3_path)\n", "type(im3)\n", "#im3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### NumPy array" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `4:10` indicates we requested the rows from index 4 (included) to 10 (not included) and the same for the columns. NumPy indexes from top to bottom and left to right, so this section is located in the top-left corner of the image. " ] }, { "cell_type": "code", "execution_count": 274, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[  0,   0,   0,   0,   0,   0],\n", "       [  0,   0,   0,   0,   0,  29],\n", "       [  0,   0,   0,  48, 166, 224],\n", "       [  0,  93, 244, 249, 253, 187],\n", "       [  0, 107, 253, 253, 230,  48],\n", "       [  0,   3,  20,  20,  15,   0]], dtype=uint8)" ] }, "execution_count": 274, "metadata": {}, "output_type": "execute_result" } ], "source": [ "array(im3)[4:10,4:10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: This is how part of the image looks as a NumPy array." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### PyTorch tensor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's the same thing as a PyTorch tensor:\n" ] }, { "cell_type": "code", "execution_count": 275, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[  0,   0,   0,   0,   0,   0],\n", "        [  0,   0,   0,   0,   0,  29],\n", "        [  0,   0,   0,  48, 166, 224],\n", "        [  0,  93, 244, 249, 253, 187],\n", "        [  0, 107, 253, 253, 230,  48],\n", "        [  0,   3,  20,  20,  15,   0]], dtype=torch.uint8)" ] }, "execution_count": 275, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tensor(im3)[4:10,4:10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: It is possible to 
convert it to a tensor as well." ] }, { "cell_type": "code", "execution_count": 276, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "\n", "
      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
 0    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
 1    0   0   0   0   0  29 150 195 254 255 254 176 193 150  96   0   0   0
 2    0   0   0  48 166 224 253 253 234 196 253 253 253 253 233   0   0   0
 3    0  93 244 249 253 187  46  10   8   4  10 194 253 253 233   0   0   0
 4    0 107 253 253 230  48   0   0   0   0   0 192 253 253 156   0   0   0
 5    0   3  20  20  15   0   0   0   0   0  43 224 253 245  74   0   0   0
 6    0   0   0   0   0   0   0   0   0   0 249 253 245 126   0   0   0   0
 7    0   0   0   0   0   0   0  14 101 223 253 248 124   0   0   0   0   0
 8    0   0   0   0   0  11 166 239 253 253 253 187  30   0   0   0   0   0
 9    0   0   0   0   0  16 248 250 253 253 253 253 232 213 111   2   0   0
10    0   0   0   0   0   0   0  43  98  98 208 253 253 253 253 187  22   0
" ], "text/plain": [ "" ] }, "execution_count": 276, "metadata": {}, "output_type": "execute_result" } ], "source": [ "im3_t = tensor(im3)\n", "df = pd.DataFrame(im3_t[4:15,4:22])\n", "df.style.set_properties(**{'font-size':'6pt'}).background_gradient('OrRd')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## BASELINE: Pixel similarity" ] }, { "cell_type": "code", "execution_count": 277, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(6131, 6265)" ] }, "execution_count": 277, "metadata": {}, "output_type": "execute_result" } ], "source": [ "seven_tensors = [tensor(Image.open(o)) for o in sevens]\n", "three_tensors = [tensor(Image.open(o)) for o in threes]\n", "len(three_tensors),len(seven_tensors)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: 'sevens' are still list of paths. 'o' is a path in the list, then with the list comprehension we use the path to read the image, then cast the image into tensor.(Same for threes). 'seven_tensor' is a list of tensors" ] }, { "cell_type": "code", "execution_count": 278, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAEQAAABECAYAAAA4E5OyAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAIEElEQVR4nO2bS08TbRuArx6ntJQONBQoYIFgEPCQYBQ0Rl2YuDFGoy504Q/xJ7j1B7hwR+LGjboQSZRo8IQBLIdWC5aDI1Baep525luYzgulKF++TjHv1yvpZmae9u41z9z3c2gNqqpS5R+MBx3A30ZVSBFVIUVUhRRRFVKE+Q/n/80lyFDqYLWHFFEVUkRVSBFVIUVUhRRRFVLEn8pu2VEUhXw+Tz6fR5ZlMpkMqVQKs9mMxWLBbDZjMpm06wVBwGKxYDCUrJJlp+JCstksW1tbrK6uMjs7y/v373nz5g0+nw+v16u9Cpw5c4bm5maMRmNFpFRMiCzLpNNp1tfX+fbtG9++fWN+fp5AIEAoFCKbzZJMJolEIqysrABgNBrp7u6mvr4eq9WK2ax/uBUTEo1GmZiY4MWLFzx69IhUKkUymSSXy5HP51laWuL9+/cYjUaMxl+pzWAwIIoiDoeDlpYWamtrdY+zYkIsFgv19fXYbDaSySTpdJp0Oq2dz+fzJdvNzc3x7t07hoaGMJlMWp7RDVVVf/cqG7Isq4lEQh0eHlZbWlpUp9Op8muu9NuXy+VS29vb1QcPHqh+v1+Nx+PlCqnkd65Y2TUajZjNZlpaWjh9+jRtbW2YTCYtUZrNZux2+667n0ql2Nzc5MePH0iShCzL+sap67tv/yCjEavVSldXF3fv3uXixYsIgqAJqK2tpampCYfDsaNdNpslHo8TDAYZHx8nHo/rG6eu716C2tpaenp6aG9vx+VyIQgCACaTCUEQdoxBtuPxeDhy5Ag1NTW6xldxIU6nk76+PoaGhvD5fLhcLgCsVit1dXVYLJZdbQwGA8ePH+fs2bO6V5qKD8wKOcPtdjM0NIQoilq1kSRpR+U5CCoupIDX6+XWrVuMjIwQi8UIhUKEQqE9ry8M+fXmwCZ3NTU1tLe3c/78eW7cuMG5c+dwu90lc4SqqgQCAfx+P6lUSte4DqyHOBwOHA4HTU1NDAwM4Ha7CYVCLC4u7vrSqqoyPj7O1tYWXq8XURR1i+vAhKTTaeLxOIlEgmg0SiAQYHV1lUQisetag8GA2+2mpaUFq9Wqa1wHJmRrawu/38/y8jLhcJhPnz7x/ft31D32mj0eDz09PbqX3YoJyeVyyLLM2toac3NzzM/PMzMzQywWIxqNMjc3t6cMgKamJrq7u7HZbLrGWVEh8XicsbEx7t+/jyRJ/PjxA0VRUBTlt20NBgM+n4/Ozk5tIKcXFasy+XyeWCyGJEmsra2RSCRQFOW3vaKAqqrMz8/z5cuXf0+VyeVybGxsIEkSP3/+RJblP/aM7czNzWGz2Whra9NGt3pQMSGCINDR0cGFCxcIh8P4/X4+fvy4r0cGIJlMEovFdB+cVUyIzWbDZrPR39/PlStXEASBycnJffUUVVVJJBLE43FyuZyucVa87IqiyODgIKIo0tDQQDabJZvNaudVVUVRFMbHx5menkaW5YoM2QtUXIjdbsdut+N0OmlsbCSTyZDJZLTzBSHpdJpgMKhtWVSKAxuY1dTU4PP5Sk7aFEXh4sWLKIrC69evCQaDrK6u4nA42NjYIJ1OY7FY9lw7+V84MCGCIOw5plBVlRMnThCLxVhYWCAYDLK5ucni4iKxWIxMJoPJZNJFyF+3lVl4ZKampnj8+DEzMzPAP2uyem9YHVgP2YuCkEAgwOjoqHa8sAXxfydkdXWVYDDI/Py8dsxgMDAwMMCxY8doa2vDZrPp8riADkIK+xvb7+J/c0clSWJsbIzv37/vOO71eunv70cUxZLrruWibEIURdGG569evaKhoYGOjg5EUcTtdv+xfS6XI5f
LMT09zdOnT3f1kK6uLk6ePIndbi9XyCUpW1JVVRVZlvn58yfPnj1jdHSUUChEJBLZcxK3fccsl8uRyWRYXFzk8+fPrK+vA79kmEwmmpqaaGxs1LV3QBl7yNbWFs+fP2dycpKXL1/idrtZWlri1KlTXL9+HavVitVqxWKxIAgC6XSaVCqlbXr7/X4+fPjAq1evtJkwwODgIL29vfT19emaOwqUTUgqlWJiYoJAIEA4HCYSiZBOp3E6nUiSpK2hOhwOzGYz2WyWSCRCNBolEonw9u1bRkZGCIVC2nzFaDTS2dnJsWPHaGho0MqunpRNiNPp5OrVq3z69InFxUXW1tZYWFjgyZMnzM7OarmksDYaDof5+vUr8XicaDTK8vIy6+vr2r5MQ0MDoihy6dIlLl++TH19/Y69YL0omxCz2UxrayvxeJy2tjby+TwrKyuEw2FCoRC1tbV4PB6am5s5dOgQX79+xe/3k8lkdkzu4FfecLlc+Hw+Dh8+jMfjqYgMAMMfVqz2/dNuRVHIZrMsLS3x8OFDVFXFarXi9/sZHh7WNrutVqv2G5FkMrkr4dpsNgRB4N69e9y8eROPx4PdbsdgMJRbSMk3K1sPMRqN2Gw26urqaG1tRRAEvF4vsixjt9u1JJnJZDQRpWaxdrsdURQ5ceIEXV1d5Qpv35R9YCaKIrdv3wZ+df3CD+YKQjY3N5EkCb/fz9TUlNauUH3u3LnDtWvXOHr0aLlD2xdlF2KxWBBFUZuTNDc3MzAwoPUGSZJwuVwoioIkSVq7worakSNH6O3txel0lju0fVG2HFKy8bYBV+FzCo9KJpPZsdNfyBGiKFJTU1OJElsyh+gq5C+n+n+Z/VAVUkRVSBFVIUVUhRRRFVLEnwZmlfmTyl9EtYcUURVSRFVIEVUhRVSFFFEVUsR/AP0FXN1zCRLUAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "show_image(three_tensors[0]);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Tip: Show image shows the first tensor as image" ] }, { "cell_type": "code", "execution_count": 279, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 279, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAEQAAABECAYAAAA4E5OyAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAJHElEQVR4nO2bXXMSZxuAL1jYXYQsiCYxIWIIjImJ0TaVTu2H41FnnGmPPOtMf0NP+i/6H9oDx+OOOtMjW6cdGz/Sk1ZqxkRDhCTQEAjfsLDsvgeWbbMmxgrEzDtcR5ln+bi5eJ5n7/t+iM0wDPr8g/1tB3DY6Aux0BdioS/EQl+IBcc+1/+fb0G23Qb7M8RCX4iFvhALfSEW+kIs9IVY6Aux0BdiYb/ErGMMw6DVatFsNqlUKmiaRrPZpNls0mg0Xnq8LMtIkoSu6+i6jtPpxOFw4HQ6EQQBWZZxOHoXds+FaJpGvV4nHo/zww8/kMlkSKVSxONxnjx5wr/7MTabjffee4+pqSmq1SqqqjIyMsKxY8eIRCIMDw8zOzuLz+frWbxdF6LrujkDisUilUqFbDbLn3/+yeLiIoVCgfX1dTKZDJVKBVEUEUWRer1OrVZjZWUFh8NhCqlUKmxublIulzl+/DgTExM9FWLbp2P2n2sZVVXZ3t5mdXWV77//nlQqxaNHj8jlcqTTaQzDwDAMZFnG7XYzODhIIBDgyZMnPH/+HLvdjt1uN2eOzWbDZrPh8/nwer1cv36daDT6hh93B7vWMh3PEE3TqNVq1Go1UqkUlUqFVCrFysoKT58+pVQqIQgCExMTRKNRnE4nkiQhiiIulwtFUfB6vczOzpLJZFheXubZs2eUy2VqtZr5Po1GA1VV0XW905BfScdCVFUlFovx22+/8c0331CpVGg2m7RaLRqNBoFAgGg0ysWLF7l69Soejwe3220+3263Y7PZ0HUdwzC4ceMG165dIxaLkUwmOw3vP9OxEMMwaDQaVKtVSqUS1WoVXdex2+2IokggEGBubo5z587h8/lwOp04nU7z+e0l8e+lJAgCNtvOGe33+wkGg8iy3GnIr6QrQqrVKrVaDVVV0TQNAFEUOXr0KHNzc3zxxRf4fD4URXnpg7ZpjzudTux2+0vXpqenmZmZQVGUTkN+JR0LcTqdhEIhRFEkn8/TbDaBF0Lcbjdzc3N4vV5EUdxTBrzYizRNI5/Pk8/nzRzFZrMhCALDw8OEw2FcLlenIb+SjoVIksTp06cJh8N88MEH5nh7KQiCgCiK+75Oo9GgVCqxtrZGMpmkWq0CmM+PRCJEo9Ed+08v6FhI+1sXBGHH3tC+Zp3+e7G+vs7PP//M77//TqlUQtM0BEEgHA4zPj7OzMwMJ06ceOk9uk1XErP2bHidmbAXd+7c4auvvkLTNHRdx+FwIIoiFy9eJBqN8u6773LixIluhPtKep66WzEMA13XqdfrlMtlM2FbWFgwN2S73U4kEiESifDhhx8SjUZ7vpm2OXAhuq6jaRrZbJbHjx/z448/cvPmTbLZrHm
7djgcXLhwgY8//phPP/2UkydPHlh8B1LtGoZBsVhkdXXVrGWSySSrq6ssLS2Ry+Wo1+vAi3zD7/dz+vRpzp8/z8DAQK9D3MGBlf/JZJLvvvuO5eVlHjx4gKqqO1LzNkNDQ0xPT3PhwgUmJyd7fpu1ciBLRtd1CoUCjx49Yn19HVVVzXzFSiaTIRaLcevWLeLxOMeOHcPj8TA2NobX62VwcLCnkg5syWxtbfHgwQM0TTM31t3IZDJkMhnW19dxu90oioLH4+HKlSucP3+eS5cuIcvyK5O8Tuh6+f/SC/y9ZDY2Nrh9+zblcplCoUCxWCSXy5mPi8fjLC0tmT0UURRxOp1mv2RqaorR0VE++ugjzpw5w9mzZ/F6vQiC8Nq5joVdjfZciJVarWbKWFtbM8d/+uknfvnlF1ZXV0mn07s+t91Ri0QifP3110xOTiJJEoIgvEkovemH/FecTieKoiDL8o7O19DQEJcuXWJtbY10Ok0mkyGXy3H//n3i8TjwYrYlEgmq1SrPnz9ncHCQ48ePv6mQXTlwIQ6HA4fDgcvlwuv1muPDw8NMT09TrVbNW/TKygrZbNYUArC5uUkulyMejxMKhTh69Gh34+vqq3VAuxBsd9VlWSYYDBKPx/nrr79IJBJsb28DL2bKxsYG8XicU6dOdTWOQ3Mu0y4EJUkye63BYJDZ2VmmpqZ2pO6GYZgzR1XVrsZxaITsRbtwtI55vd6eVL+HXgjAbndCt9uN3+/v6oYKh2gPsVIsFtne3mZhYYGHDx/uyFlsNhunTp0iEol01HLYjUMppF0MJpNJEokEiURiR2YrCAJDQ0P4/f6uH2seOiHVapVKpcK9e/e4c+cOf/zxh3lEATA5Ocn4+DiBQABZlt80S92TQyOk/YHr9TrZbJZYLMb8/DypVMq8ZrfbCQaDRCIRvF4vDoej6zXNoRFSKBRIp9Pcvn2bu3fv8vjxY5LJpNknGRgYwO12c/XqVS5fvszo6Oiu5zed8laFtCthwzDI5/M8ffqUhw8fcuPGDbO3Ci820SNHjjA4OMi5c+cIhUI9kQFvUYiqqqiqysbGBktLS9y9e5f5+XkSiQTNZtNcJi6XC1mW+fLLL/nkk08Ih8M9kwFdFvLvb9yaULXH2383Gg0KhQLxeJz5+XkWFha4d++e+fj2me/AwACKonD27FneeecdPB5Pz2RAF4Vomka1WqVcLrO2toaiKIyMjJiH3pVKha2tLUqlEpubmywvL7O4uGjWKfl8HvgnM41EIoTDYa5cuUI0GiUUCqEoSk9/PQRdFNJqtSiVSmxtbRGLxRgdHUWSJPMXRLlcjuXlZba2tsxl0u6tqqq645RPFEXGx8cJh8NEo1FmZmbMhlGv6ZqQfD7Pt99+SzKZ5P79++Zhd6vVotVqmecwqqqaM6ZSqZgb5/DwMGNjY7z//vtmk/nkyZMoioIkSV3PN/aia0IajQaJRIJnz56xuLj4Wj9ssdlsSJKEJEmMjY0RiUQ4c+YM0WiUiYkJ/H5/t8J7bbomxOPxcPnyZTweD7/++uu+QmRZ5siRI3z++ed89tlnhEIhRkZGcLlcB7Y8dqNrQhwOB8FgkHQ6TSAQMI8l98LlcuHz+ZicnGR2dpahoaEdHbS3RdeazK1WyzxvKRaL+7/x3w0ht9ttdsm6XcrvF8KugwfddT9E9P+j6nXoC7HQF2KhL8TCfrfd3lVRh5T+DLHQF2KhL8RCX4iFvhALfSEW/gcMlBno19ugeQAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "show_image(tensor(im3))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: check this in more straight way (im3>tensor>image) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Training Set: Stacking Tensors" ] }, { "cell_type": "code", "execution_count": 280, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([6131, 28, 28])" ] }, "execution_count": 280, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stacked_sevens = torch.stack(seven_tensors).float()/255\n", "stacked_threes = torch.stack(three_tensors).float()/255\n", "stacked_threes.shape" ] }, { "cell_type": "code", "execution_count": 281, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Tensor" ] }, "execution_count": 281, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(stacked_sevens)" ] }, { "cell_type": "code", "execution_count": 282, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Tensor" ] }, "execution_count": 282, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(stacked_sevens[0])" ] }, { "cell_type": "code", "execution_count": 283, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 283, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(seven_tensors)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: now we turn our list into a tensor size of ([6131, 28, 28])" ] }, { "cell_type": "code", "execution_count": 284, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 284, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(stacked_threes.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: This is rank (lenght of the shape)" ] }, { "cell_type": "code", "execution_count": 285, "metadata": {}, "outputs": [ { "data": 
{ "text/plain": [ "3" ] }, "execution_count": 285, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stacked_threes.ndim" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: This is more direct way to get it. (ndim)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mean of threes and sevens our ideal 3 and 7." ] }, { "cell_type": "code", "execution_count": 286, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAEQAAABECAYAAAA4E5OyAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAJtUlEQVR4nO1b2XLiWhJM7QsChDG22x3h//+qfnKzWVhoX5HmoaNqDufK9jRge2aCiiCEAS0nVUtWlqz0fY+r/dvU776A/za7AiLZFRDJroBIdgVEMv2D7/+fS5Ay9OHVQyS7AiLZFRDJroBI9lFS/RQ7tV1QlME8eFH7dEDkxdPf4udDAMmLVxQFfd8Pfn5JuyggQ4vs+/7o1XUdfy6+l01RFCiKAlX9E9WqqvJn9JJ/fwm7CCDy4ruu423XdTgcDjgcDqiqCm3boq5rtG2LpmlwOBzQti3vr6oqVFWFaZrQNA2WZUHXdViWBU3ToOs6NE3j3xFQZOcCcxYgQ15AIHRdh7Zt0bYtqqpCXdcoigJlWaIoCtR1jTzPUdc1yrLk4+i6DlVVMRqNYFnW0dY0Tdi2DcMwYBgGNE1jEGRgTrWTARHBkD2haRpUVYU8z1EUBcIwRJIk2Gw2CMMQu90OaZoiiiKUZYk8z3E4HNB1HUzThGEY8DwPo9EIi8UCs9kMj4+PmM1mmM/ncF0X4/EYlmUdeY4YYqeCc7aHDAFSliWqqkKSJMjzHLvdDmEYYrPZII5jBEGALMsQRRGyLEOWZRw6dOdnsxk8z0PXdSiKAoqioKoq6LqOw+EAXdfR9z17CYXPUOL9dEDkXCHmiKqqEMcx0jTFdrtFGIZ4fn7Gfr/HZrNBmqYIggBJkiCKIlRVhbIs0bYtDocDn8NxHFiWhYeHB9ze3iIMQ9zc3CDLMtze3qJtW0wmEyiKAsdxjpLvOXZ2UhXzBy1K3LZty+GlaRpM08RoNIKiKNA0jQFpmgZt27Knkaf0fY+6rlHXNYdhlmX8N53DNE2+ji/3EAJiKHfQhdZ1zVVEVVVYlsVxb9s270P7U/WhvNM0DZqmga7rnJgp76iqivl8Dk3TMB6Poes6uq7jkKEbcAowJ4eMaHRiimNd19kTyHNs24Zt27xQApRAITAJkKIokOc5NE07SpbyfqKHXsIuRszoonVdh+M4nOw8z4Pv++wB8j60QAJgv98jjmOuTMRZdP3PpYqe+R4Q31Jl6MQiGADQdR2Tp6ZpOESImYpGn2dZBl3XUVUViqJgPkILo5xD5zEM44ikib87x04CRD45ubVpmnyRXdfBcZyjaiTfTSJvTdPAsiyYpvmPUAFwlBNM02SCRpxlCLxT7SwPES9AVVVehBgKsnuLpZq25BVhGGK/32O/3yNNU+R5zhVLVVUYhnHEXonWEykb6nG+HBDxLhIQcqKTgaBS3Pc9V4+Xlxes12usViu8vLwgiiLs93tesKZpsG0bk8kEvu9jNBrBdV0GhRL6uXYyIDIQiqKg67pBUESeIvKJJE
mYwa7Xa2y3WwRBgN1uhyzLkOc5JpMJl2rXdeF5HsbjMRzHOQpRsRv+FkBEUAgEIlJvtfvU4NHdXy6XWK1W7BUESBiGHIau60LTNDiOg8lkwpTecRw4jnPkHd+WQwgA8b28JRAOhwN3tHEcI4oiDo3lcsmhst1usdlsUBQFqqri5EkVi7jNUHW5VIU5GZD/BBSRxRZFgSRJsN1usVqt8OvXL6zXazw/P+P3799YLpeI4xhJkvAxPc+D53kA/lQxSqhi6y+LRnQt59jFRGbxQihcyDuKouCmbrPZYLPZIAgCLJdLBEGANE1R1zULRCLPAHBUiaiBpLZAZqvnMtazc8iQZirS67qukWUZ4jjGer1mQJbLJZbLJZIkQZIkXIYVRTnyAjpWVVUsFbiue9QMUh/zrSHznhFIIg+h5EpyoO/7WCwWGI1GmEwmDKSmaZxEKUwo7KIoQhAE3NRR9yxrr8A3UnfRZJFZJmZEvy3Lgud5uLu7Y3WNTNRLAcAwDHRdhzzPoaoqXl9foWkabm5uoOs6bNvm4367h7w3SlAUhXOB67po2xY/fvxghlkUBbIs49AiI/BkQbrve65UqqoiiiJomgbXdTnviJ5C1/C3dramKr8XL0am2/P5HI7jwHVdFn1kSk+9DVH3OI4ZOGK1qqpiv9/DsixUVcVeJHriqXaWHjK0FUsxjRNc14VhGDBNE3VdYz6fs2fIgJCC9vr6ijiOmXiRStY0DVet0WiEsixhmib3O8SWvyyHvFVVxO+o+pD7ip0pdbhDs5yu61igtm2bQ4tAAoC6rqFpGqv1ojInMmU69t8Cc5aHyG39EDAU3zRzeav5I05BJZd6HgLGMAxOvqJsQFLkECf58hwytDBxS6CQejaUa2QPIbNtm0uvyErl38tgnGt/BYjsGVQd3tM236PW4gLp+LRYyhUkWFOYDc13L8FQyU7OISIAlBxliVAeWL8Finx8keWSQCQn7KF9ZftS1V2Me3FoTQuStVZRURMBot9TcozjmGn+arXCbrfjkSclTmK7NBCn7ve9pwM+DRBZ6xABaZrmKBcQ42zbli9cFKMVRWFQReEoTVNW37MsQ1EURyFD9F7WU7+NqYqA0BCpbVu+8Kqq+Ht5VkPAkFG1oMZts9mwRvL6+ookSVAUBatjJBT5vo/pdMojTzr2uU3eSUlVBoW8g2a0RVFwDiAKL48OyGhARSradrvFbrdDEAQcKsRGiehRBSJuQ99dwlP+ChBZFBJPTqSqLEsEQcC0m7xIfIaDRo9Ex8uyRJqmSNOU5YA8z1GWJfMQz/MwnU5xf3+Pu7s73N3dYTKZwPM82LY9mEc+HRARGPmkYnIkgTiKIvYcAk0UpEV5kYbYSZLw4xF93zMpcxwHnudhMplgMpnAcRxmwEO66qn214CIYFBiI+2TdFAAR3khDEPUdY04jo9yDok8IuOkhOn7PsbjMR4fH+H7Pp6ennB/f4+npydMp1PMZjMGhfb58pAZAoUSpvwiLxCbsd1ux5WE8g6Vb3J3Yqeu68L3ffi+j/l8jsVigdvb26MwuVQiPRkQOinxCHERNL60bRsAMJ1Ooes69vs9DMNAkiQwDIPlRNI66O7SvGU+n2M8HuP+/h6z2Qw/f/7EYrHAzc0NC88iGEPc5ssAkcERgSHdAwCr5VmWQVGOH4Ui8IjIkUfNZjOMRiPMZjP4vs9PDj08PGA8HnPeoFmMqKxdcgyhfNADDH451NOQuEP6Z9M0XCnSNOVHrUg9pypDi6PRpLilZ0rkofZQiT0BjMEdzgJkiLVSmaXRgchPaEu5gzQTEotN02SSRS+x2x16NvUMr7gcIPzlAFED/tn9fjQ7kZO03JO81aOcGSKDO19ktiv/LbblQ9uPjvfW9q3zXtLO8pCP7BIaxScu/vIe8uEZP/FOfpZd/4FIsisgkn0UMv97Pn+mXT1Esisgkl0BkewKiGRXQCS7AiLZvwBtCZqwAvXF1QAAAABJRU5ErkJggg==\n
", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "mean3 = stacked_threes.mean(0)\n", "show_image(mean3);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: This is the mean of the all tensors through first axis. 'Ideal Three'" ] }, { "cell_type": "code", "execution_count": 287, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAEQAAABECAYAAAA4E5OyAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAI6klEQVR4nO1baVPiTBc9ZF/IhoOWOjUf5v//KgelNEZICCErvB+eutemjaNC8N04VanG7H1y99uOdrsdzniF8u9+gf80nAmRcCZEwpkQCWdCJGgfHP9fdkGjvp1nCZFwJkTCmRAJZ0IknAmRcCZEwpkQCWdCJJwJkfBRpHoQPlNjkc8ZjXoDxzf47HmHYhBCxMnR74/Gz0IkYDQa8d/y2Hf+ITiKEHmS2+2Wx91uxxv9TcfFY+L1BHGy9FtRFIxGo96RjsvXH4KDCBEnIk6aJt51HbbbLbquQ9d1aNuWR9pP58n3kUET1zQNqqrCMAyoqgrTNKGqKjRNg6IoexvhEGK+TIj44kQCEdA0DZqmQVmWaJoGm80GdV1jvV6jrmvkeY6qqrDZbNA0Deq6Rtu2TFSfJKmqCkVR4LouTNPExcUFXNdFFEVwHAe+78M0TViWxQSNRiOoqordbvdlUr5EiCwZ9KVJAoqiQNM0WK/XKMsSWZahKAokSYLNZoMsy1CWJRPVNA1fS6SKKgaACRmPxzBNE9PpFJ7n4devX/B9H5qmYbfbMRHb7RaKohxExpcI6VMPmgxNcLVaoSgKxHGMNE0xn8+RZRniOEae51gsFlitVlgulywpoiqJ6kQbffUgCOB5Hn7//o0wDJGmKS4vLwEAQRBA0/6ZCqnYyQnpI0ckhlSFJCHLMqRpymOe50iSBOv1GqvVCm3boq7rvfuJIHLatkVVVRiNRmiaBlEUQVEU5HkO13VZ0kjCjsWXVUYkous6NE2DqqpQliXyPEeapojjGMvlEnEcI8sy3N/fY7VaIUkS1HWNsiyZAEVR2DCS1xCfQXamrmvoug7TNFHXNS4vL2EYBvI8h23bLK3vGeeTEPI3kIskEdd1HbquwzAM+L4PVVUBgL+6fA55ETKyy+US6/UaWZZhvV7zMwDskSkSSb9Fd/1VfIqQjxgXyaAJmqYJ27aZhPF4jDAMoaoqu03HcfhcXdehqip7qiRJkGUZ7u7u8PT0xAabDCdBdrnHkPFpQshIyQQoisISsd1uYZomuq5DEARQFAV1XcO2bei6ziqg6zosy4Jt23BdF5ZlwbIslpDNZoOyLHl/nucoioLVgQik0bIsjk2+jRCRiD5CDMMAALiuC1VV0XUdTNOEoiioqgphGLKtoNjB8zy4rssTo3sSIZ7nYTabIc9zrNdrVFWFtm1hWRYcx4Ft27Btm4kjCRNV5tu8jKizAKDrOn89ALBtm41j13WoqgqapsE0Tbiuy5LhOA50XedYYrfb7UWmwD/qRt5IVVU4jgPXdTEejxEEAUuIaJiPwacJER+kKAoHQPTypNuqqrL6ULS42+1YVSzLYsmwLIuJJZUiIk3TZEKqquKYxHEcOI4Dz/M4WqUoVbQjJydEJIaCHjHxAv6RFABMhghR98muEJF0P
bnbuq6RZRkHcuRlDMPAeDyG7/sIw5CjV13X3+Qxh+JglQFeJYV0V9d1Jqxt2z2dFr0P6TtNgK6h2KYsS6RpisVigcViwUEYqVwQBIii6A0hx7rcLxMiehvRsJLEkOEkkogQ2i8TQRCDvDzP8fz8jMfHRzw/PyNNU9R1zaG77/sIggCu68K27T3SAey938mTO3oQPVj0OuRxSBrETJU2MVUX70PGl/Kh+XyOp6cnzGYzpGmKpmmgaRo8z4PneQjDkCWGpGMoHBypytICvNoSsh80igUdoL+EUBQF0jTF/f097u7uEMcx4jhG0zRQVRVhGGIymfBIrvZYFZFxVOjeV84Tv5ZMmLyfJKNtW2w2G86QHx4eMJ/PkaYpezFys7TJrnYoUg42qrItod8A9ryGDLmMQMlekiT48+cPZrMZ5vM5FosFmqZhb3J1dYXLy0tMp1OMx2OOP0TJG4KUoyVEtCVkYGkTiZOLS2RI67pm6Xh4eEAcx3h4eECe5+i6DoZhIIoihGGIi4sLNqhiZErvMgSOtiF9Ve/tdvsm/wH2SaGSY1EUeHl5wWw2w2w2w8vLC6uK53m4vb3Fz58/cX19jZubGwRBwEmhHIx9u9v9G+RIVm5NyG6RCktUR0mSBEmSYLFYoCxLqKoK27bx48cPXFxcYDKZYDKZwHEcmKb5xn70kfBtucxnH0xSIhdtKHCjuuv9/T3HHGVZQtM0RFGEIAhwfX2N29tb3NzcIIoi2LbNkXBfyn+sCg1iQ+S/33sZUWWoClYUBZbLJfI8R5Zl7GZ938d0OkUURZhMJvB9H7Zt73mX98g4BkdLSB8phPdUheqvaZri6ekJSZJwi4LynNvbW0ynU5YQMqaidPTZjW/Ldv8GedLv7SPVIemgEiGRQV5FzGajKILneRyIDW1EZQza7O7zLMCrVxGN6HK5xOPjI6vLdruF4zgIggC2bePq6gpXV1fchxErY3LkKz7/WJx8OURfy4J6Mnmec08HANdIKGchcvq8CtAfFB6LQSVEVg+5b0M9mcVigefnZywWC2w2G4xGI1aJyWSCKIrY1ZLdoJrr0KG6jJOtD/lbD2ez2aAoCpRlia7rOF/RdZ2Lz5TeExGniEr7cJL1IWJoTipSFAVWqxWyLMPLywvyPOe2AnkWUUJEdZELQKfE4CrTJx3Ua6mqipO5uq65qKxpGtsPalGIlfTvIgMYgJC+tSLUZyXpoJ7ver1mQ0pVNbHOats2wjCE7/uwLOvDmOMUGNyG9NkOMqo00nIHsTBMTScav8uIyhhsSdV7RpQWxtBGrQbTNN8siKEikOhqRYP6X6EyIvoW1IgrgyiUp8YUqYSqqiwdhmFwi2KIPstXMWhy99451AR3XZcnKa8CIBsitiffa1mI49A4SRwC7Lc7adLU6gReWw+apnGb0zAM3khy/lbzOAUpow++8KdWnvTZEHHJFdkSMqxt2+6pEEkRkUNumEj5qDJ2IDG9Fw1eMaNqmdinESdMC+xEQoDXxXWiIZXvcYp0/808hpAQPrknYhX391XPel+qJ4EbqiImPqZ355CE8EXvlADeO953ft/EB5aKgwj5v8P530MknAmRcCZEwpkQCWdCJJwJkfAv6ObhbeIGuNEAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "mean7 = stacked_sevens.mean(0)\n", "show_image(mean7);" ] }, { "cell_type": "code", "execution_count": 288, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAEQAAABECAYAAAA4E5OyAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAJHElEQVR4nO2bXXMSZxuAL1jYXYQsiCYxIWIIjImJ0TaVTu2H41FnnGmPPOtMf0NP+i/6H9oDx+OOOtMjW6cdGz/Sk1ZqxkRDhCTQEAjfsLDsvgeWbbMmxgrEzDtcR5ln+bi5eJ5n7/t+iM0wDPr8g/1tB3DY6Aux0BdioS/EQl+IBcc+1/+fb0G23Qb7M8RCX4iFvhALfSEW+kIs9IVY6Aux0BdiYb/ErGMMw6DVatFsNqlUKmiaRrPZpNls0mg0Xnq8LMtIkoSu6+i6jtPpxOFw4HQ6EQQBWZZxOHoXds+FaJpGvV4nHo/zww8/kMlkSKVSxONxnjx5wr/7MTabjffee4+pqSmq1SqqqjIyMsKxY8eIRCIMDw8zOzuLz+frWbxdF6LrujkDisUilUqFbDbLn3/+yeLiIoVCgfX1dTKZDJVKBVEUEUWRer1OrVZjZWUFh8NhCqlUKmxublIulzl+/DgTExM9FWLbp2P2n2sZVVXZ3t5mdXWV77//nlQqxaNHj8jlcqTTaQzDwDAMZFnG7XYzODhIIBDgyZMnPH/+HLvdjt1uN2eOzWbDZrPh8/nwer1cv36daDT6hh93B7vWMh3PEE3TqNVq1Go1UqkUlUqFVCrFysoKT58+pVQqIQgCExMTRKNRnE4nkiQhiiIulwtFUfB6vczOzpLJZFheXubZs2eUy2VqtZr5Po1GA1VV0XW905BfScdCVFUlFovx22+/8c0331CpVGg2m7RaLRqNBoFAgGg0ysWLF7l69Soejwe3220+3263Y7PZ0HUdwzC4ceMG165dIxaLkUwmOw3vP9OxEMMwaDQaVKtVSqUS1WoVXdex2+2IokggEGBubo5z587h8/lwOp04nU7z+e0l8e+lJAgCNtvOGe33+wkGg8iy3GnIr6QrQqrVKrVaDVVV0TQNAFEUOXr0KHNzc3zxxRf4fD4URXnpg7ZpjzudTux2+0vXpqenmZmZQVGUTkN+JR0LcTqdhEIhRFEkn8/TbDaBF0Lcbjdzc3N4vV5EUdxTBrzYizRNI5/Pk8/nzRzFZrMhCALDw8OEw2FcLlenIb+SjoVIksTp06cJh8N88MEH5nh7KQiCgCiK+75Oo9GgVCqxtrZGMpmkWq0CmM+PRCJEo9Ed+08v6FhI+1sXBGHH3tC+Zp3+e7G+vs7PP//M77//TqlUQtM0BEEgHA4zPj7OzMwMJ06ceOk9uk1XErP2bHidmbAXd+7c4auvvkLTNHRdx+FwIIoiFy9eJBqN8u6773LixIluhPtKep66WzEMA13XqdfrlMtlM2FbWFgwN2S73U4kEiESifDhhx8SjUZ7vpm2OXAhuq6jaRrZbJbHjx/z448/cvPmTbLZrHm7djgcXLhwgY8//phPP/2UkydPHlh8B1LtGoZBsVhkdXXVrGWSySSrq6ssLS2Ry+Wo1+vAi3zD7/dz+vRpzp8/z8DAQK9D3MGBlf/JZJLvvvuO5eVlHjx4gKqqO1LzNkNDQ0xPT3PhwgUmJyd7fpu1ciBLRtd1CoUCjx49Yn19HVVVzXzFSiaTIRaLcevWLeLxOMeOH
cPj8TA2NobX62VwcLCnkg5syWxtbfHgwQM0TTM31t3IZDJkMhnW19dxu90oioLH4+HKlSucP3+eS5cuIcvyK5O8Tuh6+f/SC/y9ZDY2Nrh9+zblcplCoUCxWCSXy5mPi8fjLC0tmT0UURRxOp1mv2RqaorR0VE++ugjzpw5w9mzZ/F6vQiC8Nq5joVdjfZciJVarWbKWFtbM8d/+uknfvnlF1ZXV0mn07s+t91Ri0QifP3110xOTiJJEoIgvEkovemH/FecTieKoiDL8o7O19DQEJcuXWJtbY10Ok0mkyGXy3H//n3i8TjwYrYlEgmq1SrPnz9ncHCQ48ePv6mQXTlwIQ6HA4fDgcvlwuv1muPDw8NMT09TrVbNW/TKygrZbNYUArC5uUkulyMejxMKhTh69Gh34+vqq3VAuxBsd9VlWSYYDBKPx/nrr79IJBJsb28DL2bKxsYG8XicU6dOdTWOQ3Mu0y4EJUkye63BYJDZ2VmmpqZ2pO6GYZgzR1XVrsZxaITsRbtwtI55vd6eVL+HXgjAbndCt9uN3+/v6oYKh2gPsVIsFtne3mZhYYGHDx/uyFlsNhunTp0iEol01HLYjUMppF0MJpNJEokEiURiR2YrCAJDQ0P4/f6uH2seOiHVapVKpcK9e/e4c+cOf/zxh3lEATA5Ocn4+DiBQABZlt80S92TQyOk/YHr9TrZbJZYLMb8/DypVMq8ZrfbCQaDRCIRvF4vDoej6zXNoRFSKBRIp9Pcvn2bu3fv8vjxY5LJpNknGRgYwO12c/XqVS5fvszo6Oiu5zed8laFtCthwzDI5/M8ffqUhw8fcuPGDbO3Ci820SNHjjA4OMi5c+cIhUI9kQFvUYiqqqiqysbGBktLS9y9e5f5+XkSiQTNZtNcJi6XC1mW+fLLL/nkk08Ih8M9kwFdFvLvb9yaULXH2383Gg0KhQLxeJz5+XkWFha4d++e+fj2me/AwACKonD27FneeecdPB5Pz2RAF4Vomka1WqVcLrO2toaiKIyMjJiH3pVKha2tLUqlEpubmywvL7O4uGjWKfl8HvgnM41EIoTDYa5cuUI0GiUUCqEoSk9/PQRdFNJqtSiVSmxtbRGLxRgdHUWSJPMXRLlcjuXlZba2tsxl0u6tqqq645RPFEXGx8cJh8NEo1FmZmbMhlGv6ZqQfD7Pt99+SzKZ5P79++Zhd6vVotVqmecwqqqaM6ZSqZgb5/DwMGNjY7z//vtmk/nkyZMoioIkSV3PN/aia0IajQaJRIJnz56xuLj4Wj9ssdlsSJKEJEmMjY0RiUQ4c+YM0WiUiYkJ/H5/t8J7bbomxOPxcPnyZTweD7/++uu+QmRZ5siRI3z++ed89tlnhEIhRkZGcLlcB7Y8dqNrQhwOB8FgkHQ6TSAQMI8l98LlcuHz+ZicnGR2dpahoaEdHbS3RdeazK1WyzxvKRaL+7/x3w0ht9ttdsm6XcrvF8KugwfddT9E9P+j6nXoC7HQF2KhL8TCfrfd3lVRh5T+DLHQF2KhL8RCX4iFvhALfSEW/gcMlBno19ugeQAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "a_3 = stacked_threes[1]\n", "show_image(a_3);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Distance between the ideal three and other threes" ] }, { "cell_type": "code", "execution_count": 289, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(tensor(0.1114), tensor(0.2021))" ] }, "execution_count": 289, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dist_3_abs = (a_3 - mean3).abs().mean()\n", "dist_3_sqr = ((a_3 - mean3)**2).mean().sqrt()\n", "dist_3_abs,dist_3_sqr" ] }, { "cell_type": "code", "execution_count": 290, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(tensor(0.1586), tensor(0.3021))" ] }, "execution_count": 290, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dist_7_abs = (a_3 - mean7).abs().mean()\n", "dist_7_sqr = ((a_3 - mean7)**2).mean().sqrt()\n", "dist_7_abs,dist_7_sqr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: Then we need to calculate the distance between the 'ideal' and ordinary three.Two methods for getting the distance __L1 Norm__ and __MSE__ second one is panelize bigger mistake more havil, L1 is uniform.\n", "\n", "It is obvious that a_3 is closer to the perfect 3 so our approach worked at this time. 
(Both in L1 and MSE)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pytorch L1 and MSE fuctions" ] }, { "cell_type": "code", "execution_count": 291, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(tensor(0.1586), tensor(0.3021))" ] }, "execution_count": 291, "metadata": {}, "output_type": "execute_result" } ], "source": [ "F.l1_loss(a_3.float(),mean7), F.mse_loss(a_3,mean7).sqrt()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: torch.nn.functional as F (for mse, manually take the sqrt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Important: (from notebook) If you don't know what C is, don't worry as you won't need it at all. In a nutshell, it's a low-level (low-level means more similar to the language that computers use internally) language that is very fast compared to Python. To take advantage of its speed while programming in Python, try to avoid as much as possible writing loops, and replace them by commands that work directly on arrays or tensors." 
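, "\n",
"\n",
"A minimal sketch of that advice, not taken from the book: it assumes only that PyTorch is installed, uses random stand-in tensors instead of the MNIST images, and the helper names `l1_loop` and `l1_vectorized` are hypothetical. Both helpers compute the same mean absolute difference, but the second one leaves the looping to PyTorch.\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"torch.manual_seed(0)\n",
"imgs = torch.rand(100, 28, 28)   # stand-in for a stack of images (assumption)\n",
"ideal = torch.rand(28, 28)       # stand-in for an 'ideal' digit (assumption)\n",
"\n",
"def l1_loop(imgs, ideal):\n",
"    # Slow: a Python-level loop that handles one image at a time\n",
"    return torch.stack([(img - ideal).abs().mean() for img in imgs])\n",
"\n",
"def l1_vectorized(imgs, ideal):\n",
"    # Fast: one tensor expression, averaged over the last two axes\n",
"    return (imgs - ideal).abs().mean((-1, -2))\n",
"\n",
"# Both approaches should agree up to floating point rounding\n",
"same = torch.allclose(l1_loop(imgs, ideal), l1_vectorized(imgs, ideal))\n",
"print(same)\n",
"```\n",
"\n",
"On large stacks of images the vectorized version is usually much faster, because the per-image work no longer goes through the Python interpreter."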
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Array and Tensor Examples" ] }, { "cell_type": "code", "execution_count": 292, "metadata": {}, "outputs": [], "source": [ "data = [[1,2,3],[4,5,6]]\n", "arr = array (data)\n", "tns = tensor(data)" ] }, { "cell_type": "code", "execution_count": 293, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [4, 5, 6]])" ] }, "execution_count": 293, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr # numpy" ] }, { "cell_type": "code", "execution_count": 294, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[1, 2, 3],\n", " [4, 5, 6]])" ] }, "execution_count": 294, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tns # pytorch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Splitting, adding, multiplying tensors" ] }, { "cell_type": "code", "execution_count": 295, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([2, 5])" ] }, "execution_count": 295, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tns[:,1]" ] }, { "cell_type": "code", "execution_count": 296, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([5, 6])" ] }, "execution_count": 296, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tns[1,1:3]" ] }, { "cell_type": "code", "execution_count": 297, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[2, 3, 4],\n", " [5, 6, 7]])" ] }, "execution_count": 297, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tns+1" ] }, { "cell_type": "code", "execution_count": 298, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'torch.LongTensor'" ] }, "execution_count": 298, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tns.type()" ] }, { "cell_type": "code", "execution_count": 299, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[1.5000, 3.0000, 4.5000],\n", " [6.0000, 7.5000, 
9.0000]])" ] }, "execution_count": 299, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tns*1.5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Validation set: Stacking Tensors" ] }, { "cell_type": "code", "execution_count": 300, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(torch.Size([1010, 28, 28]), torch.Size([1028, 28, 28]))" ] }, "execution_count": 300, "metadata": {}, "output_type": "execute_result" } ], "source": [ "valid_3_tens = torch.stack([tensor(Image.open(o)) \n", "                            for o in (path/'valid'/'3').ls()])\n", "valid_3_tens = valid_3_tens.float()/255\n", "valid_7_tens = torch.stack([tensor(Image.open(o)) \n", "                            for o in (path/'valid'/'7').ls()])\n", "valid_7_tens = valid_7_tens.float()/255\n", "valid_3_tens.shape,valid_7_tens.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Manual L1 distance function" ] }, { "cell_type": "code", "execution_count": 301, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(0.1114)" ] }, "execution_count": 301, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def mnist_distance(a,b): return (a-b).abs().mean((-1,-2))\n", "mnist_distance(a_3, mean3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### This is broadcasting:" ] }, { "cell_type": "code", "execution_count": 302, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(tensor([0.1270, 0.1254, 0.1114, ..., 0.1494, 0.1097, 0.1365]),\n", " torch.Size([1010]))" ] }, "execution_count": 302, "metadata": {}, "output_type": "execute_result" } ], "source": [ "valid_3_dist = mnist_distance(valid_3_tens, mean3)\n", "valid_3_dist, valid_3_dist.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: I think this is an example of avoiding Python loops, which would slow things down (see the Important note above). Although the shapes of the two tensors don't match, our function still works: PyTorch broadcasts `mean3` (shape [28, 28]) across all 1010 images in `valid_3_tens` (shape [1010, 28, 28])."
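, "\n",
"\n",
"As a small illustration of what broadcasting does here (a sketch with made-up toy tensors, not the MNIST data; the names `batch`, `ideal`, `diff` and `explicit` are hypothetical), subtracting a single 2x2 'image' from a stack of three gives the same result as first expanding it to the full shape:\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"batch = torch.ones(3, 2, 2)                    # three 2x2 'images'\n",
"ideal = torch.tensor([[0., 1.], [1., 0.]])     # one 2x2 'ideal' image\n",
"\n",
"diff = batch - ideal                           # ideal is treated as if it had shape (3, 2, 2)\n",
"explicit = batch - ideal.expand(3, 2, 2)       # the same subtraction with the expansion written out\n",
"\n",
"print(diff.shape)\n",
"print(torch.equal(diff, explicit))\n",
"```\n",
"\n",
"The expansion never copies the data; PyTorch only pretends the smaller tensor has the larger shape, which is part of why this is cheap."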
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Here is another example where the shapes don't match:__" ] }, { "cell_type": "code", "execution_count": 303, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([2, 3, 4])" ] }, "execution_count": 303, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tensor([1,2,3]) + tensor(1)" ] }, { "cell_type": "code", "execution_count": 304, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([1010, 28, 28])" ] }, "execution_count": 304, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(valid_3_tens-mean3).shape" ] }, { "cell_type": "code", "execution_count": 305, "metadata": {}, "outputs": [], "source": [ "def is_3(x): return mnist_distance(x,mean3) < mnist_distance(x,mean7)" ] }, { "cell_type": "code", "execution_count": 306, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(tensor(True), tensor(1.))" ] }, "execution_count": 306, "metadata": {}, "output_type": "execute_result" } ], "source": [ "is_3(a_3), is_3(a_3).float()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Here is another broadcast, this time over the whole validation set:" ] }, { "cell_type": "code", "execution_count": 307, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([True, True, True, ..., True, True, True])" ] }, "execution_count": 307, "metadata": {}, "output_type": "execute_result" } ], "source": [ "is_3(valid_3_tens)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Accuracy of our 'ideal' 3 and 7" ] }, { "cell_type": "code", "execution_count": 308, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(tensor(0.9168), tensor(0.9854), tensor(0.9511))" ] }, "execution_count": 308, "metadata": {}, "output_type": "execute_result" } ], "source": [ "accuracy_3s = is_3(valid_3_tens).float() .mean()\n", "accuracy_7s = (1 - is_3(valid_7_tens).float()).mean()\n", "\n", "accuracy_3s,accuracy_7s,(accuracy_3s+accuracy_7s)/2" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "## STOCHASTIC GRADIENT DECENT (SGD)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Arthur Samues Machine Learning process:__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Initialize the weights.\n", "- For each image, use these weights to predict whether it appears to be a 3 or a 7.\n", "- Based on these predictions, calculate how good the model is (its loss).\n", "- Calculate the gradient, which measures for each weight, how changing that weight would change the loss (SGD)\n", "- Step (that is, change) all the weights based on that calculation.\n", "- Go back to the step 2, and repeat the process.\n", "- Iterate until you decide to stop the training process (for instance, because the model is good enough or you don't want to wait any longer).\n" ] }, { "cell_type": "code", "execution_count": 309, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "G\n", "\n", "\n", "init\n", "\n", "init\n", "\n", "\n", "predict\n", "\n", "predict\n", "\n", "\n", "init->predict\n", "\n", "\n", "\n", "\n", "loss\n", "\n", "loss\n", "\n", "\n", "predict->loss\n", "\n", "\n", "\n", "\n", "gradient\n", "\n", "gradient\n", "\n", "\n", "loss->gradient\n", "\n", "\n", "\n", "\n", "step\n", "\n", "step\n", "\n", "\n", "gradient->step\n", "\n", "\n", "\n", "\n", "step->predict\n", "\n", "\n", "repeat\n", "\n", "\n", "stop\n", "\n", "stop\n", "\n", "\n", "step->stop\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 309, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#id gradient_descent\n", "#caption The gradient descent process\n", "#alt Graph showing the steps for Gradient Descent\n", "gv('''\n", "init->predict->loss->gradient->step->stop\n", "step->predict[label=repeat]\n", "''')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### GD example" ] }, { "cell_type": "code", "execution_count": 310, "metadata": {}, 
"outputs": [], "source": [ "def f(x): return x**2" ] }, { "cell_type": "code", "execution_count": 311, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plot_function(f, 'x', 'x**2')\n", "plt.scatter(-1.5, f(-1.5), color='red');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We need to decrease the loss" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How to calculate gradient:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Now our tensor xt is under investigation. Pytorch will keeps its eye on it.__" ] }, { "cell_type": "code", "execution_count": 312, "metadata": {}, "outputs": [], "source": [ "xt = tensor(3.).requires_grad_()" ] }, { "cell_type": "code", "execution_count": 313, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(9., grad_fn=)" ] }, "execution_count": 313, "metadata": {}, "output_type": "execute_result" } ], "source": [ "yt = f(xt)\n", "yt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Result is 9 but there is a grad function in the result.__\n", "***" ] }, { "cell_type": "code", "execution_count": 314, "metadata": {}, "outputs": [], "source": [ "yt.backward()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__backward calculates the derivative.__" ] }, { "cell_type": "code", "execution_count": 315, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(6.)" ] }, "execution_count": 315, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xt.grad" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__result is 6.__\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__now with a bigger tensor__" ] }, { "cell_type": "code", "execution_count": 316, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([ 3., 4., 10.], requires_grad=True)" ] }, "execution_count": 316, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xt = tensor([3.,4.,10.]).requires_grad_()\n", "xt" ] }, 
{ "cell_type": "code", "execution_count": 317, "metadata": {}, "outputs": [], "source": [ "def f(x): return (x**2).sum()" ] }, { "cell_type": "code", "execution_count": 318, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(125., grad_fn=)" ] }, "execution_count": 318, "metadata": {}, "output_type": "execute_result" } ], "source": [ "yt = f(xt)\n", "yt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__again we expect 2*xt:__" ] }, { "cell_type": "code", "execution_count": 319, "metadata": {}, "outputs": [], "source": [ "yt.backward()\n" ] }, { "cell_type": "code", "execution_count": 320, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([ 6., 8., 20.])" ] }, "execution_count": 320, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xt.grad" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### End to end SGD example" ] }, { "cell_type": "code", "execution_count": 321, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 13., 14., 15., 16., 17., 18., 19.])" ] }, "execution_count": 321, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time = torch.arange(0,20).float()\n", "time" ] }, { "cell_type": "code", "execution_count": 322, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 322, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "speed = torch.randn(20)*3 + 0.75*(time-9.5)**2 + 1\n", "plt.scatter(time,speed)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Now we are trying to come up with some parameters for our quadratic fuction that predicts speed any given time. Our choice is quadratic but that could be something else too. with a quadratic function our problem would be much easier.__ " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__here is the function gets time and parameter as inputs and predicts a result:__" ] }, { "cell_type": "code", "execution_count": 323, "metadata": {}, "outputs": [], "source": [ "def f(t, params):\n", " a,b,c = params\n", " return a*(t**2) + (b*t) + c" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__this our loss function that calculate distance between prediction and target( actual mesurements)__" ] }, { "cell_type": "code", "execution_count": 324, "metadata": {}, "outputs": [], "source": [ "def mse(preds, targets): return ((preds-targets)**2).mean().sqrt()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### __Step 1: here are initial random parameters:__" ] }, { "cell_type": "code", "execution_count": 325, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([ 0.9569, 0.0048, -0.1506], requires_grad=True)" ] }, "execution_count": 325, "metadata": {}, "output_type": "execute_result" } ], "source": [ "params = torch.randn(3).requires_grad_()\n", "params" ] }, { "cell_type": "code", "execution_count": 326, "metadata": {}, "outputs": [], "source": [ "orig_params = params.clone()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Step 2: calculate predictions:" ] }, { "cell_type": "code", "execution_count": 327, "metadata": {}, "outputs": [], "source": [ "preds = f(time,params)" ] }, { "cell_type": "code", "execution_count": 328, "metadata": {}, "outputs": [], "source": [ "def 
show_preds(preds, ax=None):\n", "    if ax is None: ax=plt.subplots()[1]\n", "    ax.scatter(time, speed)\n", "    ax.scatter(time, to_np(preds), color='red')\n", "    ax.set_ylim(-300,100)" ] }, { "cell_type": "code", "execution_count": 329, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "show_preds(preds)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Step 3: Calculate the loss" ] }, { "cell_type": "code", "execution_count": 330, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(139.3082, grad_fn=)" ] }, "execution_count": 330, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loss = mse(preds,speed)\n", "loss" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "__The Question is how to improve these results:__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Step 4: first we calculate the gradient:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Pytorch makes it easier we just call the backward() on the loss but it calculates gradient for the params 'a' 'b' and 'c'.___" ] }, { "cell_type": "code", "execution_count": 331, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([165.0324, 10.5991, 0.6615])" ] }, "execution_count": 331, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loss.backward()\n", "params.grad # this is the derivative of the initial values in other word our slope." ] }, { "cell_type": "code", "execution_count": 332, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([1.6503e-03, 1.0599e-04, 6.6150e-06])" ] }, "execution_count": 332, "metadata": {}, "output_type": "execute_result" } ], "source": [ "params.grad * 1e-5 # scaler at the end is learning rate." ] }, { "cell_type": "code", "execution_count": 333, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([ 0.9569, 0.0048, -0.1506], requires_grad=True)" ] }, "execution_count": 333, "metadata": {}, "output_type": "execute_result" } ], "source": [ "params # they are still same." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "#### Step 5: Step the weight." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We picked a learning rate of 1e-5: a very small step, to avoid overshooting the lowest possible loss." ] }, { "cell_type": "code", "execution_count": 334, "metadata": {}, "outputs": [], "source": [ "lr = 1e-5\n", "params.data -= lr * params.grad.data\n", "params.grad = None" ] }, { "cell_type": "code", "execution_count": 335, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(139.0348, grad_fn=)" ] }, "execution_count": 335, "metadata": {}, "output_type": "execute_result" } ], "source": [ "preds = f(time,params)\n", "mse(preds, speed)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's create a function that performs all of these steps:" ] }, { "cell_type": "code", "execution_count": 336, "metadata": {}, "outputs": [], "source": [ "def apply_step(params, prn=True):\n", "    preds = f(time, params)\n", "    loss = mse(preds, speed)\n", "    loss.backward()\n", "    params.data -= lr * params.grad.data\n", "    params.grad = None\n", "    if prn: print(loss.item())\n", "    return preds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Step 6: Repeat the process:" ] }, { "cell_type": "code", "execution_count": 337, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "139.03475952148438\n", "138.76133728027344\n", "138.4879150390625\n", "138.2145538330078\n", "137.94122314453125\n", "137.6679229736328\n", "137.39466857910156\n", "137.12144470214844\n", "136.84825134277344\n", "136.5751190185547\n" ] } ], "source": [ "for i in range(10): apply_step(params)" ] }, { "cell_type": "code", "execution_count": 338, "metadata": {}, "outputs": [], "source": [ "params = orig_params.detach().requires_grad_()" ] }, { "cell_type": "code", "execution_count": 339, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "_,axs = plt.subplots(1,4,figsize=(12,3))\n", "for ax in axs: show_preds(apply_step(params, False), ax)\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "## MNIST" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Loss Function our 3 and 7 recognizer. Currently we use metric not loss" ] }, { "cell_type": "code", "execution_count": 340, "metadata": {}, "outputs": [], "source": [ "train_x = torch.cat([stacked_threes, stacked_sevens]).view(-1, 28*28)" ] }, { "cell_type": "code", "execution_count": 341, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([12396, 784])" ] }, "execution_count": 341, "metadata": {}, "output_type": "execute_result" } ], "source": [ "train_x.size()" ] }, { "cell_type": "code", "execution_count": 342, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(torch.Size([12396, 784]), torch.Size([12396, 1]))" ] }, "execution_count": 342, "metadata": {}, "output_type": "execute_result" } ], "source": [ "train_y = tensor([1]*len(threes) + [0]*len(sevens)).unsqueeze(1)\n", "train_x.shape,train_y.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How tensor manipulated " ] }, { "cell_type": "code", "execution_count": 343, "metadata": {}, "outputs": [], "source": [ "temp_tensor = tensor (1)" ] }, { "cell_type": "code", "execution_count": 344, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(1)" ] }, "execution_count": 344, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_tensor" ] }, { "cell_type": "code", "execution_count": 345, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Tensor" ] }, "execution_count": 345, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(temp_tensor)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__is above tensor is wrong what's the 
difference?__\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__we have a tensor__" ] }, { "cell_type": "code", "execution_count": 346, "metadata": {}, "outputs": [], "source": [ "temp_tensor = tensor([1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__then we multiuplied the inside of__" ] }, { "cell_type": "code", "execution_count": 347, "metadata": {}, "outputs": [], "source": [ "temp_tensor =tensor([1]*4)" ] }, { "cell_type": "code", "execution_count": 348, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([1, 1, 1, 1])" ] }, "execution_count": 348, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_tensor" ] }, { "cell_type": "code", "execution_count": 349, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([4])" ] }, "execution_count": 349, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_tensor.shape" ] }, { "cell_type": "code", "execution_count": 350, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 350, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_tensor.ndim" ] }, { "cell_type": "code", "execution_count": 351, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([4])" ] }, "execution_count": 351, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_tensor.size()" ] }, { "cell_type": "code", "execution_count": 352, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[1],\n", " [1],\n", " [1],\n", " [1]])" ] }, "execution_count": 352, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(temp_tensor).unsqueeze(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Warning: __looked changed but why size is still unchanged why not [4,1]__" ] }, { "cell_type": "code", "execution_count": 353, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([4])" ] }, "execution_count": 353, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "temp_tensor.shape" ] }, { "cell_type": "code", "execution_count": 354, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([4])" ] }, "execution_count": 354, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_tensor.size()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How unsqueeze works?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Warning: Whaaaaaaaaaaaaat?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(temp_tensor).unsqueeze(1) doesn't work but (temp_tensor*1).unsqueeze(1) you need to unsqueeze it when creating otherwise it doesnt work. I do not believe it." ] }, { "cell_type": "code", "execution_count": 355, "metadata": {}, "outputs": [], "source": [ "temp_tensor = tensor([1]).unsqueeze(1)" ] }, { "cell_type": "code", "execution_count": 356, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([1, 1])" ] }, "execution_count": 356, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_tensor.shape" ] }, { "cell_type": "code", "execution_count": 357, "metadata": {}, "outputs": [], "source": [ "temp_tensor =tensor([1]*1).unsqueeze(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dataset" ] }, { "cell_type": "code", "execution_count": 358, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(torch.Size([784]), 1, tensor([1]))" ] }, "execution_count": 358, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dset = list(zip(train_x,train_y))\n", "x,y = dset[0]\n", "x.shape,x.ndim,y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__we create list of tuples, each tuple contains a image and a target__" ] }, { "cell_type": "code", "execution_count": 359, "metadata": {}, "outputs": [], "source": [ "valid_x = torch.cat([valid_3_tens, valid_7_tens]).view(-1, 28*28)\n", "valid_y = tensor([1]*len(valid_3_tens) + [0]*len(valid_7_tens)).unsqueeze(1)\n", 
"valid_dset = list(zip(valid_x,valid_y))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__same for validation__\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Weights" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__this is not clear on the videos but consider a layer NN of 728 inputs and 1 output.__" ] }, { "cell_type": "code", "execution_count": 360, "metadata": {}, "outputs": [], "source": [ "def init_params(size, std=1.0): return (torch.randn(size)*std).requires_grad_()" ] }, { "cell_type": "code", "execution_count": 361, "metadata": {}, "outputs": [], "source": [ "weights = init_params((28*28,1))" ] }, { "cell_type": "code", "execution_count": 362, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([784, 1])" ] }, "execution_count": 362, "metadata": {}, "output_type": "execute_result" } ], "source": [ "weights.shape" ] }, { "cell_type": "code", "execution_count": 363, "metadata": {}, "outputs": [], "source": [ "bias = init_params(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: The function `weights*pixels` won't be flexible enough—it is always equal to 0 when the pixels are equal to 0 (i.e., its *intercept* is 0). You might remember from high school math that the formula for a line is `y=w*x+b`; we still need the `b`. 
We'll initialize it to a random number too:" ] }, { "cell_type": "code", "execution_count": 364, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([0.0959], requires_grad=True)" ] }, "execution_count": 364, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bias" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Why the weight matrix is transposed here is, again, not made clear, but Tariq Rashid's book is very helpful at this point. (`weights` has shape [784, 1], so `weights.T` has shape [1, 784] and lines up with the 784 pixels of `train_x[0]` for an element-wise multiply.)__" ] }, { "cell_type": "code", "execution_count": 365, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([-5.6867], grad_fn=<AddBackward0>)" ] }, "execution_count": 365, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(train_x[0]*weights.T).sum() + bias" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__For the whole dataset, put this multiplication into a function (`@` is matrix multiplication):__" ] }, { "cell_type": "code", "execution_count": 366, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[ -5.6867],\n", "        [ -6.5451],\n", "        [ -2.0241],\n", "        ...,\n", "        [-14.3286],\n", "        [ 4.3505],\n", "        [-12.6773]], grad_fn=<AddBackward0>)" ] }, "execution_count": 366, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def linear1(xb): return xb@weights + bias\n", "preds = linear1(train_x)\n", "preds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Threshold the predictions (above 0.5 is predicted as a 3, otherwise as a 7) and compare them with the targets:__" ] }, { "cell_type": "code", "execution_count": 367, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[False],\n", "        [False],\n", "        [False],\n", "        ...,\n", "        [ True],\n", "        [False],\n", "        [ True]])" ] }, "execution_count": 367, "metadata": {}, "output_type": "execute_result" } ], "source": [ "corrects = (preds>0.5).float() == train_y\n", "corrects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "__Check the accuracy:__" ] }, { "cell_type": "code", "execution_count": 368, "metadata": {}, "outputs": [ { "data": { "text/plain": [ 
"0.4636172950267792" ] }, "execution_count": 368, "metadata": {}, "output_type": "execute_result" } ], "source": [ "corrects.float().mean().item()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__almost half of them is 3 and the other half is 7 (since weighs are totally random)__ \n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why we need a loss Function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Basically we need to have gradients for correcting our weighs, we need to know which direction we need to go__ \n", "\n", "\n", "If you dont understand all of these, ckeck khan academy for gradient." ] }, { "cell_type": "code", "execution_count": 369, "metadata": {}, "outputs": [], "source": [ "trgts = tensor([1,0,1])\n", "prds = tensor([0.9, 0.4, 0.2])" ] }, { "cell_type": "code", "execution_count": 370, "metadata": {}, "outputs": [], "source": [ "def mnist_loss(predictions, targets):\n", " return torch.where(targets==1, 1-predictions, predictions).mean()" ] }, { "cell_type": "code", "execution_count": 371, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([0.1000, 0.4000, 0.8000])" ] }, "execution_count": 371, "metadata": {}, "output_type": "execute_result" } ], "source": [ "torch.where(trgts==1, 1-prds, prds)" ] }, { "cell_type": "code", "execution_count": 372, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(0.4333)" ] }, "execution_count": 372, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mnist_loss(prds,trgts)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sigmoid" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__We need this for squishing predictions between 0-1__" ] }, { "cell_type": "code", "execution_count": 373, "metadata": {}, "outputs": [], "source": [ "def sigmoid(x): return 1/(1+torch.exp(-x))" ] }, { "cell_type": "code", "execution_count": 374, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", 
"text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plot_function(torch.sigmoid, title='Sigmoid', min=-4, max=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__update the fuction with the sigmoid thats all.__" ] }, { "cell_type": "code", "execution_count": 375, "metadata": {}, "outputs": [], "source": [ "def mnist_loss(predictions, targets):\n", " predictions = predictions.sigmoid()\n", " return torch.where(targets==1, 1-predictions, predictions).mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What are SGD and Mini-Batches" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__This explains most of it.__" ] }, { "cell_type": "code", "execution_count": 376, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[tensor([ 0, 2, 10, 13, 8]),\n", " tensor([11, 12, 4, 1, 5]),\n", " tensor([ 3, 14, 6, 9, 7])]" ] }, "execution_count": 376, "metadata": {}, "output_type": "execute_result" } ], "source": [ "coll = range(15)\n", "dl = DataLoader(coll, batch_size=5, shuffle=True)\n", "list(dl)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__but this is only a list however we neeed a tuple consist of independent and dependent variable.__" ] }, { "cell_type": "code", "execution_count": 377, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(#26) [(0, 'a'),(1, 'b'),(2, 'c'),(3, 'd'),(4, 'e'),(5, 'f'),(6, 'g'),(7, 'h'),(8, 'i'),(9, 'j')...]" ] }, "execution_count": 377, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds = L(enumerate(string.ascii_lowercase))\n", "ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### DataLoader" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__then put it into a Dataloader.__" ] }, { "cell_type": "code", "execution_count": 378, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[(tensor([ 1, 23, 9, 8, 24, 2]), ('b', 'x', 'j', 'i', 'y', 'c')),\n", " (tensor([14, 25, 13, 
11, 19, 5]), ('o', 'z', 'n', 'l', 't', 'f')),\n", " (tensor([ 0, 10, 4, 7, 18, 12]), ('a', 'k', 'e', 'h', 's', 'm')),\n", " (tensor([ 6, 21, 15, 16, 22, 3]), ('g', 'v', 'p', 'q', 'w', 'd')),\n", " (tensor([20, 17]), ('u', 'r'))]" ] }, "execution_count": 378, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dl = DataLoader(ds, batch_size=6, shuffle=True)\n", "list(dl)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Now we have batches, each a tuple of independent and dependent variables.__\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__All together__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It's time to implement the gradient descent process we saw earlier. In code, each epoch will look something like this:\n", "\n", "```python\n", "for x,y in dl:\n", "    pred = model(x)\n", "    loss = loss_func(pred, y)\n", "    loss.backward()\n", "    parameters -= parameters.grad * lr\n", "```" ] }, { "cell_type": "code", "execution_count": 379, "metadata": {}, "outputs": [], "source": [ "weights = init_params((28*28,1))\n", "bias = init_params(1)" ] }, { "cell_type": "code", "execution_count": 380, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(torch.Size([256, 784]), torch.Size([256, 1]))" ] }, "execution_count": 380, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dl = DataLoader(dset, batch_size=256)\n", "xb,yb = first(dl)\n", "xb.shape,yb.shape" ] }, { "cell_type": "code", "execution_count": 381, "metadata": {}, "outputs": [], "source": [ "valid_dl = DataLoader(valid_dset, batch_size=256)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__A small test__" ] }, { "cell_type": "code", "execution_count": 382, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([4, 784])" ] }, "execution_count": 382, "metadata": {}, "output_type": "execute_result" } ], "source": [ "batch = train_x[:4]\n", "batch.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", 
"__Predictions__" ] }, { "cell_type": "code", "execution_count": 383, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[ 8.0575],\n", " [14.3841],\n", " [-3.8017],\n", " [ 5.1179]], grad_fn=)" ] }, "execution_count": 383, "metadata": {}, "output_type": "execute_result" } ], "source": [ "preds = linear1(batch)\n", "preds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Loss__" ] }, { "cell_type": "code", "execution_count": 384, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(0.2461, grad_fn=)" ] }, "execution_count": 384, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loss = mnist_loss(preds, train_y[:4])\n", "loss" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Gradients__" ] }, { "cell_type": "code", "execution_count": 385, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(torch.Size([784, 1]), tensor(-0.0010), tensor([-0.0069]))" ] }, "execution_count": 385, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loss.backward()\n", "weights.grad.shape,weights.grad.mean(),bias.grad" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__For the step we need an optimizer.__\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Put it all into a function, except the optimizer step.__" ] }, { "cell_type": "code", "execution_count": 386, "metadata": {}, "outputs": [], "source": [ "def calc_grad(xb, yb, model):\n", "    preds = model(xb)\n", "    loss = mnist_loss(preds, yb)\n", "    loss.backward()" ] }, { "cell_type": "code", "execution_count": 387, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(tensor(-0.0021), tensor([-0.0138]))" ] }, "execution_count": 387, "metadata": {}, "output_type": "execute_result" } ], "source": [ "calc_grad(batch, train_y[:4], linear1)\n", "weights.grad.mean(),bias.grad" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Warning: If you run it twice, the results change, because loss.backward() adds the new gradients to the ones already stored." 
] }, { "cell_type": "code", "execution_count": 388, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(tensor(-0.0031), tensor([-0.0207]))" ] }, "execution_count": 388, "metadata": {}, "output_type": "execute_result" } ], "source": [ "calc_grad(batch, train_y[:4], linear1)\n", "weights.grad.mean(),bias.grad" ] }, { "cell_type": "code", "execution_count": 389, "metadata": {}, "outputs": [], "source": [ "weights.grad.zero_()\n", "bias.grad.zero_();" ] }, { "cell_type": "code", "execution_count": 390, "metadata": {}, "outputs": [], "source": [ "def train_epoch(model, lr, params):\n", "    for xb,yb in dl:\n", "        calc_grad(xb, yb, model)\n", "        for p in params:\n", "            p.data -= p.grad*lr\n", "            p.grad.zero_()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__A small conversion of our results. It's important because we need to understand what our model says about the digits (three or not three).__" ] }, { "cell_type": "code", "execution_count": 391, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[ True],\n", " [ True],\n", " [False],\n", " [ True]])" ] }, "execution_count": 391, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(preds>0.0).float() == train_y[:4]" ] }, { "cell_type": "code", "execution_count": 392, "metadata": {}, "outputs": [], "source": [ "def batch_accuracy(xb, yb):\n", "    preds = xb.sigmoid()\n", "    correct = (preds>0.5) == yb\n", "    return correct.float().mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "__This is the accuracy on our small training batch.__" ] }, { "cell_type": "code", "execution_count": 393, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor(0.7500)" ] }, "execution_count": 393, "metadata": {}, "output_type": "execute_result" } ], "source": [ "batch_accuracy(linear1(batch), train_y[:4])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__This is the accuracy over the whole validation set.__" ] }, { "cell_type": "code", "execution_count": 394, "metadata": {}, "outputs": 
[], "source": [ "def validate_epoch(model):\n", "    accs = [batch_accuracy(model(xb), yb) for xb,yb in valid_dl]\n", "    return round(torch.stack(accs).mean().item(), 4)" ] }, { "cell_type": "code", "execution_count": 395, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.5136" ] }, "execution_count": 395, "metadata": {}, "output_type": "execute_result" } ], "source": [ "validate_epoch(linear1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Training" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__One epoch of training__" ] }, { "cell_type": "code", "execution_count": 396, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.7121" ] }, "execution_count": 396, "metadata": {}, "output_type": "execute_result" } ], "source": [ "lr = 1.\n", "params = weights,bias\n", "train_epoch(linear1, lr, params)\n", "validate_epoch(linear1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Then 20 more epochs__" ] }, { "cell_type": "code", "execution_count": 397, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.8656 0.9203 0.9457 0.9549 0.9593 0.9623 0.9652 0.9666 0.9681 0.9705 0.9706 0.9711 0.972 0.973 0.9735 0.9735 0.974 0.9745 0.9755 0.9755 " ] } ], "source": [ "for i in range(20):\n", "    train_epoch(linear1, lr, params)\n", "    print(validate_epoch(linear1), end=' ')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optimizer" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Let's start creating our model with PyTorch instead of our \"linear1\" function. 
PyTorch also creates the parameters, as our init_params function did.__\n" ] }, { "cell_type": "code", "execution_count": 398, "metadata": {}, "outputs": [], "source": [ "linear_model = nn.Linear(28*28,1)" ] }, { "cell_type": "code", "execution_count": 399, "metadata": {}, "outputs": [], "source": [ "w,b = linear_model.parameters()" ] }, { "cell_type": "code", "execution_count": 400, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(torch.Size([1, 784]), torch.Size([1]))" ] }, "execution_count": 400, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w.shape, b.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Custom optimizer__" ] }, { "cell_type": "code", "execution_count": 401, "metadata": {}, "outputs": [], "source": [ "class BasicOptim:\n", "    def __init__(self,params,lr): self.params,self.lr = list(params),lr\n", "\n", "    def step(self, *args, **kwargs):\n", "        for p in self.params: p.data -= p.grad.data * self.lr\n", "\n", "    def zero_grad(self, *args, **kwargs):\n", "        for p in self.params: p.grad = None" ] }, { "cell_type": "code", "execution_count": 402, "metadata": {}, "outputs": [], "source": [ "opt = BasicOptim(linear_model.parameters(), lr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__The new training function will be__" ] }, { "cell_type": "code", "execution_count": 403, "metadata": {}, "outputs": [], "source": [ "def train_epoch(model):\n", "    for xb,yb in dl:\n", "        calc_grad(xb, yb, model)\n", "        opt.step()\n", "        opt.zero_grad()" ] }, { "cell_type": "code", "execution_count": 404, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.4078" ] }, "execution_count": 404, "metadata": {}, "output_type": "execute_result" } ], "source": [ "validate_epoch(linear_model)" ] }, { "cell_type": "code", "execution_count": 405, "metadata": {}, "outputs": [], "source": [ "def train_model(model, epochs):\n", "    for i in range(epochs):\n", "        train_epoch(model)\n", "        print(validate_epoch(model), end=' ')" ] }, { 
"cell_type": "code", "execution_count": 406, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.4932 0.8193 0.8418 0.9136 0.9331 0.9477 0.9555 0.9629 0.9658 0.9673 0.9697 0.9717 0.9736 0.9751 0.9761 0.9761 0.9775 0.9775 0.9785 0.9785 " ] } ], "source": [ "train_model(linear_model, 20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fastai's SGD class" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Instead of using the \"BasicOptim\" class, we can use fastai's SGD class.__" ] }, { "cell_type": "code", "execution_count": 407, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.4932 0.7808 0.8623 0.9185 0.9365 0.9521 0.9575 0.9638 0.9658 0.9678 0.9707 0.9726 0.9741 0.9751 0.9761 0.9765 0.9775 0.978 0.9785 0.9785 " ] } ], "source": [ "linear_model = nn.Linear(28*28,1)\n", "opt = SGD(linear_model.parameters(), lr)\n", "train_model(linear_model, 20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__This time, drop \"train_model\" and use fastai's \"Learner.fit\". Before using Learner, we first need to pass our training and validation data into \"DataLoaders\", not \"DataLoader\".__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Fastai's DataLoaders" ] }, { "cell_type": "code", "execution_count": 408, "metadata": {}, "outputs": [], "source": [ "dls = DataLoaders(dl, valid_dl)" ] }, { "cell_type": "code", "execution_count": 409, "metadata": {}, "outputs": [], "source": [ "learn = Learner(dls, nn.Linear(28*28,1), opt_func=SGD,\n", "                loss_func=mnist_loss, metrics=batch_accuracy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Fastai's Fit" ] }, { "cell_type": "code", "execution_count": 410, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
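<table>
<thead><tr><th>epoch</th><th>train_loss</th><th>valid_loss</th><th>batch_accuracy</th><th>time</th></tr></thead>
<tbody>
<tr><td>0</td><td>0.637166</td><td>0.503575</td><td>0.495584</td><td>00:00</td></tr>
<tr><td>1</td><td>0.562232</td><td>0.139727</td><td>0.900393</td><td>00:00</td></tr>
<tr><td>2</td><td>0.204552</td><td>0.207935</td><td>0.806183</td><td>00:00</td></tr>
<tr><td>3</td><td>0.088904</td><td>0.114767</td><td>0.904809</td><td>00:00</td></tr>
<tr><td>4</td><td>0.046327</td><td>0.081602</td><td>0.930324</td><td>00:00</td></tr>
<tr><td>5</td><td>0.029754</td><td>0.064530</td><td>0.944553</td><td>00:00</td></tr>
<tr><td>6</td><td>0.022963</td><td>0.054135</td><td>0.954858</td><td>00:00</td></tr>
<tr><td>7</td><td>0.019966</td><td>0.047293</td><td>0.961236</td><td>00:00</td></tr>
<tr><td>8</td><td>0.018464</td><td>0.042515</td><td>0.965162</td><td>00:00</td></tr>
<tr><td>9</td><td>0.017573</td><td>0.039011</td><td>0.966634</td><td>00:00</td></tr>
</tbody>
</table>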
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "learn.fit(10, lr=lr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding a Nonlinearity" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__The basic idea is that by using more linear layers, we can have our model do more computation, and therefore model more complex functions. But there's no point just putting one linear layer directly after another one, because when we multiply things together and then add them up multiple times, that could be replaced by multiplying different things together and adding them up just once! That is to say, a series of any number of linear layers in a row can be replaced with a single linear layer with a different set of parameters.__ (From Fastbook)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Amazingly enough, it can be mathematically proven that this little function can solve any computable problem to an arbitrarily high level of accuracy, if you can find the right parameters for w1 and w2 and if you make these matrices big enough. For any arbitrarily wiggly function, we can approximate it as a bunch of lines joined together; to make it closer to the wiggly function, we just have to use shorter lines. This is known as the __universal approximation theorem__. The three lines of code that we have here are known as layers. 
The first and third are known as linear layers, and the second line of code is known variously as a nonlinearity, or activation function. (From Fastbook)" ] }, { "cell_type": "code", "execution_count": 411, "metadata": {}, "outputs": [], "source": [ "simple_net = nn.Sequential(\n", "    nn.Linear(28*28,30),\n", "    nn.ReLU(),\n", "    nn.Linear(30,1)\n", ")" ] }, { "cell_type": "code", "execution_count": 412, "metadata": {}, "outputs": [], "source": [ "learn = Learner(dls, simple_net, opt_func=SGD,\n", "                loss_func=mnist_loss, metrics=batch_accuracy)" ] }, { "cell_type": "code", "execution_count": 413, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", 
"
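<table>
<thead><tr><th>epoch</th><th>train_loss</th><th>valid_loss</th><th>batch_accuracy</th><th>time</th></tr></thead>
<tbody>
<tr><td>0</td><td>0.303284</td><td>0.398378</td><td>0.511776</td><td>00:00</td></tr>
<tr><td>1</td><td>0.142384</td><td>0.221517</td><td>0.817959</td><td>00:00</td></tr>
<tr><td>2</td><td>0.079702</td><td>0.112610</td><td>0.917076</td><td>00:00</td></tr>
<tr><td>3</td><td>0.052855</td><td>0.076474</td><td>0.942100</td><td>00:00</td></tr>
<tr><td>4</td><td>0.040301</td><td>0.059791</td><td>0.958783</td><td>00:00</td></tr>
<tr><td>5</td><td>0.033824</td><td>0.050389</td><td>0.964181</td><td>00:00</td></tr>
<tr><td>6</td><td>0.030075</td><td>0.044483</td><td>0.966143</td><td>00:00</td></tr>
<tr><td>7</td><td>0.027629</td><td>0.040465</td><td>0.966634</td><td>00:00</td></tr>
<tr><td>8</td><td>0.025865</td><td>0.037553</td><td>0.969578</td><td>00:00</td></tr>
<tr><td>9</td><td>0.024499</td><td>0.035336</td><td>0.971541</td><td>00:00</td></tr>
<tr><td>10</td><td>0.023391</td><td>0.033579</td><td>0.972522</td><td>00:00</td></tr>
<tr><td>11</td><td>0.022467</td><td>0.032142</td><td>0.973994</td><td>00:00</td></tr>
<tr><td>12</td><td>0.021679</td><td>0.030936</td><td>0.973994</td><td>00:00</td></tr>
<tr><td>13</td><td>0.020997</td><td>0.029901</td><td>0.974975</td><td>00:00</td></tr>
<tr><td>14</td><td>0.020398</td><td>0.028998</td><td>0.974975</td><td>00:00</td></tr>
<tr><td>15</td><td>0.019869</td><td>0.028198</td><td>0.975957</td><td>00:00</td></tr>
<tr><td>16</td><td>0.019395</td><td>0.027484</td><td>0.976448</td><td>00:00</td></tr>
<tr><td>17</td><td>0.018966</td><td>0.026841</td><td>0.976938</td><td>00:00</td></tr>
<tr><td>18</td><td>0.018577</td><td>0.026259</td><td>0.977429</td><td>00:00</td></tr>
<tr><td>19</td><td>0.018220</td><td>0.025730</td><td>0.977429</td><td>00:00</td></tr>
<tr><td>20</td><td>0.017892</td><td>0.025244</td><td>0.978410</td><td>00:00</td></tr>
<tr><td>21</td><td>0.017588</td><td>0.024799</td><td>0.979882</td><td>00:00</td></tr>
<tr><td>22</td><td>0.017306</td><td>0.024388</td><td>0.979882</td><td>00:00</td></tr>
<tr><td>23</td><td>0.017042</td><td>0.024008</td><td>0.980373</td><td>00:00</td></tr>
<tr><td>24</td><td>0.016794</td><td>0.023656</td><td>0.980864</td><td>00:00</td></tr>
<tr><td>25</td><td>0.016561</td><td>0.023328</td><td>0.980864</td><td>00:00</td></tr>
<tr><td>26</td><td>0.016341</td><td>0.023022</td><td>0.980864</td><td>00:00</td></tr>
<tr><td>27</td><td>0.016133</td><td>0.022737</td><td>0.981845</td><td>00:00</td></tr>
<tr><td>28</td><td>0.015935</td><td>0.022470</td><td>0.981845</td><td>00:00</td></tr>
<tr><td>29</td><td>0.015746</td><td>0.022221</td><td>0.981845</td><td>00:00</td></tr>
<tr><td>30</td><td>0.015566</td><td>0.021988</td><td>0.982336</td><td>00:00</td></tr>
<tr><td>31</td><td>0.015395</td><td>0.021769</td><td>0.982336</td><td>00:00</td></tr>
<tr><td>32</td><td>0.015231</td><td>0.021565</td><td>0.982826</td><td>00:00</td></tr>
<tr><td>33</td><td>0.015076</td><td>0.021371</td><td>0.982826</td><td>00:00</td></tr>
<tr><td>34</td><td>0.014925</td><td>0.021190</td><td>0.982826</td><td>00:00</td></tr>
<tr><td>35</td><td>0.014782</td><td>0.021018</td><td>0.982826</td><td>00:00</td></tr>
<tr><td>36</td><td>0.014643</td><td>0.020856</td><td>0.982826</td><td>00:00</td></tr>
<tr><td>37</td><td>0.014510</td><td>0.020703</td><td>0.982826</td><td>00:00</td></tr>
<tr><td>38</td><td>0.014382</td><td>0.020558</td><td>0.982826</td><td>00:00</td></tr>
<tr><td>39</td><td>0.014258</td><td>0.020420</td><td>0.982826</td><td>00:00</td></tr>
</tbody>
</table>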
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "learn.fit(40, 0.1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### The recorder is a fastai callback" ] }, { "cell_type": "code", "execution_count": 414, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(L(learn.recorder.values).itemgot(2));" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__The final accuracy value__" ] }, { "cell_type": "code", "execution_count": 415, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.982826292514801" ] }, "execution_count": 415, "metadata": {}, "output_type": "execute_result" } ], "source": [ "learn.recorder.values[-1][2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GOING DEEPER" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Why go deeper, if two linear layers with a nonlinearity between them are enough?__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We already know that a single nonlinearity with two linear layers is enough to approximate any function. So why would we use deeper models? The reason is performance. With a deeper model (that is, one with more layers) we do not need to use as many parameters; it turns out that we can use smaller matrices with more layers, and get better results than we would get with larger matrices, and few layers." ] }, { "cell_type": "code", "execution_count": 416, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
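<table>
<thead><tr><th>epoch</th><th>train_loss</th><th>valid_loss</th><th>accuracy</th><th>time</th></tr></thead>
<tbody>
<tr><td>0</td><td>0.089727</td><td>0.011755</td><td>0.997056</td><td>00:13</td></tr>
</tbody>
</table>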
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "dls = ImageDataLoaders.from_folder(path)\n", "learn = cnn_learner(dls, resnet18, pretrained=False,\n", "                    loss_func=F.cross_entropy, metrics=accuracy)\n", "learn.fit_one_cycle(1, 0.1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 4 }