{ "cells": [ { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2019-04-07T02:19:02.272939Z", "start_time": "2019-04-07T02:19:02.268259Z" }, "slideshow": { "slide_type": "slide" } }, "source": [ "\n", "\n", "# Chapter 7: Neural Networks and Deep Learning\n", "\n", "\n", "\n", "\n", "![image.png](./images/author.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/neural.png)\n", "\n", "Sung Kim, HKUST\n", "\n", "- Code: https://github.com/hunkim/PyTorchZeroToAll \n", "- Slides: http://bit.ly/PyTorchZeroAll \n", "- Videos: https://www.bilibili.com/video/av15823922/\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/neural2.png)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "\n", "**The Neuron: A Biological Information Processor**\n", "\n", "- dendrites - the receivers\n", "- soma - the neuron cell body (sums input signals)\n", "- axon - the transmitter\n", "- synapse - the point of transmission\n", "\n", "A neuron activates once a certain threshold is met.\n", "\n", "Learning occurs via electrochemical changes in the effectiveness of the synaptic junctions. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "\n", "**An Artificial Neuron: The Perceptron, simulated in hardware or software**. Learning occurs via changes in the values of the connection weights. 
\n", "\n", "- input connections - the receivers\n", "- node - simulates the neuron body\n", "- output connection - the transmitter\n", "- **activation function** - applies a threshold or bias\n", "- connection weights - act as synaptic junctions\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "\n", "A neural network consists of the following components:\n", "- An **input layer**, **x**\n", "- An arbitrary number of **hidden layers**\n", "- An **output layer**, **ŷ**\n", "- A set of **weights** and **biases** between each pair of layers, **W and b**\n", "- A choice of **activation function** for each hidden layer, **σ**\n", "    - e.g., the sigmoid activation function." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Each iteration of the training process consists of the following steps:\n", "\n", "1. Calculating the predicted output **ŷ**, known as `feedforward`\n", "1. Updating the weights and biases using gradients computed by `backpropagation`\n", "\n", "![image.png](./images/neural4.png)\n", "\n", "The **activation function** **σ** is applied at each hidden layer." ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/neural5.png)\n", "\n", "https://blog.ttro.com/artificial-intelligence-will-shape-e-learning-for-good/ \n" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2019-04-07T05:41:09.971531Z", "start_time": "2019-04-07T05:41:09.967172Z" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/neural6.png)\n", "\n", "http://playground.tensorflow.org/" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/neural7.png)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Batch, Iteration, & Epoch\n", "\n", "Batch size is the number of training examples in a single batch.\n", "\n", "![image.png](./images/neural8.png)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Note: The number of batches equals the number of iterations in one epoch. Batch size and number of batches (iterations) are two different things.\n", "\n", "\n", "Let’s say we have 2000 training examples.\n", "\n", "If we divide the dataset of 2000 examples into batches of 500, it takes 4 iterations to complete 1 epoch.\n", "\n", "Here the batch size is 500 and the number of iterations is 4 for 1 complete epoch." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Gradient Descent\n", "\n", "![](images/gradient.gif)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/neural9.png)\n", "\n", "Let's represent the parameters as $\\Theta$, the learning rate as $\\alpha$, and the gradient as $\\nabla J(\\Theta)$. Each gradient descent step then updates the parameters as $\\Theta \\leftarrow \\Theta - \\alpha \\nabla J(\\Theta)$." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Manual Gradient" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T03:38:13.578136Z", "start_time": "2020-08-13T03:38:13.345432Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import seaborn as sns\n", "sns.set()\n", "\n", "x_data = [1.0, 2.0, 3.0]\n", "y_data = [2.0, 4.0, 6.0]\n", "\n", "plt.plot(x_data, y_data, 'r-o');" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T03:38:14.210339Z", "start_time": "2020-08-13T03:38:14.207126Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Linear model for the forward pass; w is read from the global scope\n", "def forward(x):\n", "    return x * w\n", "\n", "# Squared-error loss for a single example\n", "def loss(y_pred, y_val):\n", "    return (y_pred - y_val) ** 2\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T03:38:14.860718Z", "start_time": "2020-08-13T03:38:14.855815Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Candidate weights and the mean squared error (MSE) of each\n", "w_list = []\n", "mse_list = []\n", "\n", "for w in np.arange(0.0, 4.1, 0.1):\n", "    # Reset the accumulated loss for this candidate weight\n", "    #print(\"w=\", w)\n", "    l_sum = 0\n", "    for x_val, y_val in zip(x_data, y_data):\n", "        # For each input/output pair, compute the prediction (y_hat)\n", "        # and add its squared error to the running total\n", "        y_pred = forward(x_val)\n", "        l = loss(y_pred, y_val)\n", "        l_sum += l\n", "        #print(\"\\t\", x_val, y_val, y_pred, l)\n", "    # Average the accumulated loss to get the MSE for this weight,\n", "    # then record the (weight, MSE) pair\n", "    #print(\"MSE=\", l_sum / len(x_data))\n", "    w_list.append(w)\n", "    mse_list.append(l_sum / len(x_data))" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T03:38:15.764701Z", "start_time": "2020-08-13T03:38:15.525889Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Plot it all\n", "plt.plot(w_list, mse_list)\n", "plt.ylabel('Loss')\n", "plt.xlabel('w')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T03:40:03.659167Z", "start_time": "2020-08-13T03:40:03.608503Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 0 w= 2.0 loss= 0.0\r", "Epoch: 1 w= 2.0 loss= 0.0\r", "Epoch: 2 w= 2.0 loss= 0.0\r", "Epoch: 3 w= 2.0 loss= 0.0\r", "Epoch: 4 w= 2.0 loss= 0.0\r", "Epoch: 5 w= 2.0 loss= 0.0\r", "Epoch: 6 w= 2.0 loss= 0.0\r", "Epoch: 7 w= 2.0 loss= 0.0\r", "Epoch: 8 w= 2.0 loss= 0.0\r", "Epoch: 9 w= 2.0 loss= 0.0\r", "Epoch: 10 w= 2.0 loss= 0.0\r", "Epoch: 11 w= 2.0 loss= 0.0\r", "Epoch: 12 w= 2.0 loss= 0.0\r", "Epoch: 13 w= 2.0 loss= 0.0\r", "Epoch: 14 w= 2.0 loss= 0.0\r", "Epoch: 15 w= 2.0 loss= 0.0\r", "Epoch: 16 w= 2.0 loss= 0.0\r", "Epoch: 17 w= 2.0 loss= 0.0\r", "Epoch: 18 w= 2.0 loss= 0.0\r", "Epoch: 19 w= 2.0 loss= 0.0\r", "Epoch: 20 w= 2.0 loss= 0.0\r", "Epoch: 21 w= 2.0 loss= 0.0\r", "Epoch: 22 w= 2.0 loss= 0.0\r", "Epoch: 23 w= 2.0 loss= 0.0\r", "Epoch: 24 w= 2.0 loss= 0.0\r", "Epoch: 25 w= 2.0 loss= 0.0\r", "Epoch: 26 w= 2.0 loss= 0.0\r", "Epoch: 27 w= 2.0 loss= 0.0\r", "Epoch: 28 w= 2.0 loss= 0.0\r", "Epoch: 29 w= 2.0 loss= 0.0\r", "Epoch: 30 w= 2.0 loss= 0.0\r", "Epoch: 31 w= 2.0 loss= 0.0\r", "Epoch: 32 w= 2.0 loss= 0.0\r", "Epoch: 33 w= 2.0 loss= 0.0\r", "Epoch: 34 w= 2.0 loss= 0.0\r", "Epoch: 35 w= 2.0 loss= 0.0\r", "Epoch: 36 w= 2.0 loss= 0.0\r", "Epoch: 37 w= 2.0 loss= 0.0\r", "Epoch: 38 w= 2.0 loss= 0.0\r", "Epoch: 39 w= 2.0 loss= 0.0\r", "Epoch: 40 w= 2.0 loss= 0.0\r", "Epoch: 41 w= 2.0 loss= 0.0\r", "Epoch: 42 w= 2.0 loss= 0.0\r", "Epoch: 43 w= 2.0 loss= 0.0\r", "Epoch: 44 w= 2.0 loss= 0.0\r", "Epoch: 45 w= 2.0 loss= 0.0\r", "Epoch: 46 w= 2.0 loss= 0.0\r", "Epoch: 47 w= 2.0 loss= 0.0\r", "Epoch: 
48 w= 2.0 loss= 0.0\r", "Epoch: 49 w= 2.0 loss= 0.0\r", "Epoch: 50 w= 2.0 loss= 0.0\r", "Epoch: 51 w= 2.0 loss= 0.0\r", "Epoch: 52 w= 2.0 loss= 0.0\r", "Epoch: 53 w= 2.0 loss= 0.0\r", "Epoch: 54 w= 2.0 loss= 0.0\r", "Epoch: 55 w= 2.0 loss= 0.0\r", "Epoch: 56 w= 2.0 loss= 0.0\r", "Epoch: 57 w= 2.0 loss= 0.0\r", "Epoch: 58 w= 2.0 loss= 0.0\r", "Epoch: 59 w= 2.0 loss= 0.0\r", "Epoch: 60 w= 2.0 loss= 0.0\r", "Epoch: 61 w= 2.0 loss= 0.0\r", "Epoch: 62 w= 2.0 loss= 0.0\r", "Epoch: 63 w= 2.0 loss= 0.0\r", "Epoch: 64 w= 2.0 loss= 0.0\r", "Epoch: 65 w= 2.0 loss= 0.0\r", "Epoch: 66 w= 2.0 loss= 0.0\r", "Epoch: 67 w= 2.0 loss= 0.0\r", "Epoch: 68 w= 2.0 loss= 0.0\r", "Epoch: 69 w= 2.0 loss= 0.0\r", "Epoch: 70 w= 2.0 loss= 0.0\r", "Epoch: 71 w= 2.0 loss= 0.0\r", "Epoch: 72 w= 2.0 loss= 0.0\r", "Epoch: 73 w= 2.0 loss= 0.0\r", "Epoch: 74 w= 2.0 loss= 0.0\r", "Epoch: 75 w= 2.0 loss= 0.0\r", "Epoch: 76 w= 2.0 loss= 0.0\r", "Epoch: 77 w= 2.0 loss= 0.0\r", "Epoch: 78 w= 2.0 loss= 0.0\r", "Epoch: 79 w= 2.0 loss= 0.0\r", "Epoch: 80 w= 2.0 loss= 0.0\r", "Epoch: 81 w= 2.0 loss= 0.0\r", "Epoch: 82 w= 2.0 loss= 0.0\r", "Epoch: 83 w= 2.0 loss= 0.0\r", "Epoch: 84 w= 2.0 loss= 0.0\r", "Epoch: 85 w= 2.0 loss= 0.0\r", "Epoch: 86 w= 2.0 loss= 0.0\r", "Epoch: 87 w= 2.0 loss= 0.0\r", "Epoch: 88 w= 2.0 loss= 0.0\r", "Epoch: 89 w= 2.0 loss= 0.0\r", "Epoch: 90 w= 2.0 loss= 0.0\r", "Epoch: 91 w= 2.0 loss= 0.0\r", "Epoch: 92 w= 2.0 loss= 0.0\r", "Epoch: 93 w= 2.0 loss= 0.0\r", "Epoch: 94 w= 2.0 loss= 0.0\r", "Epoch: 95 w= 2.0 loss= 0.0\r", "Epoch: 96 w= 2.0 loss= 0.0\r", "Epoch: 97 w= 2.0 loss= 0.0\r", "Epoch: 98 w= 2.0 loss= 0.0\r", "Epoch: 99 w= 2.0 loss= 0.0\r" ] } ], "source": [ "# compute gradient\n", "def gradient(x, y): # d_loss/d_w\n", " return 2 * x * (x * w - y)\n", "\n", "# Training loop\n", "for epoch in range(100):\n", " for x_val, y_val in zip(x_data, y_data):\n", " # Compute derivative w.r.t to the learned weights\n", " # Update the weights\n", " # Compute the loss and print progress\n", 
" grad = gradient(x_val, y_val)\n", " w = w - 0.01 * grad\n", " #print(\"\\tgrad: \", x_val, y_val, round(grad, 2))\n", " y_pred = forward(x_val)\n", " l = loss(y_pred, y_val)\n", " print(\"Epoch:\", epoch, \"w=\", round(w, 2), \"loss=\", round(l, 2), end='\\r')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Auto Gradient" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T03:41:27.541561Z", "start_time": "2020-08-13T03:41:26.706805Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "import torch\n", "w = torch.tensor([1.0], requires_grad=True)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T03:44:23.779127Z", "start_time": "2020-08-13T03:44:23.716863Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 99 | Loss: 9.094947017729282e-13\n" ] } ], "source": [ "# Training loop\n", "for epoch in range(100):\n", " for x_val, y_val in zip(x_data, y_data):\n", " y_pred = forward(x_val) # 1) Forward pass\n", " l = loss(y_pred, y_val) # 2) Compute loss\n", " l.backward() # 3) Back propagation to update weights\n", " #print(\"\\tgrad: \", x_val, y_val, w.grad.item())\n", " w.data = w.data - 0.01 * w.grad.item()\n", " # Manually zero the gradients after updating weights\n", " w.grad.data.zero_()\n", "\n", "print(f\"Epoch: {epoch} | Loss: {l.item()}\")" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T03:44:38.300392Z", "start_time": "2020-08-13T03:44:38.272470Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "tensor([2.0000])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "w.data" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } 
}, "source": [ "## Backpropagation in a More Complicated Network\n", "\n", "\n", "![image.png](./images/nn10.png)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/nn11.png)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/nn12.png)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/nn13.png)\n" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T03:53:35.218082Z", "start_time": "2020-08-13T03:53:35.214175Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "from torch import nn\n", "import torch\n", "from torch import tensor\n", "from torch import sigmoid\n", "\n", "x_data = tensor([[1.0], [2.0], [3.0]])\n", "y_data = tensor([[2.0], [4.0], [6.0]])" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T03:56:49.423764Z", "start_time": "2020-08-13T03:56:49.419364Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class Model(nn.Module):\n", "    def __init__(self):\n", "        \"\"\"\n", "        In the constructor we instantiate a single nn.Linear module.\n", "        \"\"\"\n", "        super(Model, self).__init__()\n", "        self.linear = torch.nn.Linear(1, 1)  # One in and one out\n", "\n", "    def forward(self, x):\n", "        \"\"\"\n", "        In the forward function we accept a Tensor of input data and we must return\n", "        a Tensor of output data. We can use Modules defined in the constructor as\n", "        well as arbitrary operators on Tensors.\n", "        \"\"\"\n", "        y_pred = self.linear(x)\n", "        return y_pred" ] },
{ "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T03:59:36.441381Z", "start_time": "2020-08-13T03:59:36.435510Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# our model\n", "model = Model()\n", "# Construct our loss function and an optimizer. The call to model.parameters()\n", "# in the SGD constructor returns the learnable parameters (weight and bias)\n", "# of the nn.Linear module that is a member of the model.\n", "criterion = torch.nn.MSELoss(reduction='sum')\n", "optimizer = torch.optim.SGD(model.parameters(), lr=0.01)\n" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T04:00:31.825433Z", "start_time": "2020-08-13T04:00:31.721167Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 0 | Loss: 80.97074890136719 \n", "Epoch: 100 | Loss: 0.2520463764667511 \n", "Epoch: 200 | Loss: 0.059265460819005966 \n", "Epoch: 300 | Loss: 0.013935551047325134 \n", "Epoch: 400 | Loss: 0.003276776522397995 \n" ] } ], "source": [ "# Training loop\n", "for epoch in range(500):\n", "    # 1) Forward pass: compute predicted y by passing x to the model\n", "    y_pred = model(x_data)\n", "\n", "    # 2) Compute the loss and print it every 100 epochs\n", "    loss = criterion(y_pred, y_data)\n", "    if epoch % 100 == 0:\n", "        print(f'Epoch: {epoch} | Loss: {loss.item()} ')\n", "\n", "    # 3) Zero gradients, perform a backward pass, and update the weights\n", "    optimizer.zero_grad()\n", "    loss.backward()\n", "    optimizer.step()" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "ExecuteTime": { "end_time": "2020-08-13T04:01:38.716592Z", "start_time": "2020-08-13T04:01:38.711952Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { 
"name": "stdout", "output_type": "stream", "text": [ "Prediction (after training) 4 7.967859268188477\n" ] } ], "source": [ "# After training: predict y for x = 4\n", "hour_var = tensor([[4.0]])\n", "y_pred = model(hour_var)\n", "print(\"Prediction (after training)\", 4, y_pred.item())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### PyTorch Rhythm\n", "\n", "![image.png](./images/nn14.png)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Regression\n", "\n", "Let’s start with a simple example: house prices. \n", "- Say you’re helping a friend who wants to buy a house.\n", "\n", "- She was quoted $400,000 for a 2,000 sq ft house (about 186 square meters). \n", "\n", "Is this a good price or not?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "So you ask friends who have bought houses in the same neighborhood, and you end up with three data points:\n", "\n", "\n", "\n", "| Area in sq ft (x) | Price in USD (y) | \n", "| -------------|:-------------:|\n", "|2,104|399,900|\n", "|1,600|329,900|\n", "|2,400|369,000|" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "$$y = f(X) = W X$$\n", "\n", "- Calculating the prediction is simple multiplication.\n", "- But before that, we need to think about the weight we’ll be multiplying by. 
\n", "- “Training” a neural network just means finding the weights we use to calculate the prediction.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "A simple predictive model (“regression model”)\n", "- takes an input, \n", "- does a calculation, \n", "- and gives an output \n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2019-04-07T03:15:14.317623Z", "start_time": "2019-04-07T03:15:14.313438Z" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Model Evaluation\n", "- If we apply our model to the three data points we have, how good a job would it do?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Loss Function**\n", "\n", "A loss function measures how bad our prediction is.\n", "\n", "- For each point, the error is the difference between the **actual value** and the **predicted value**, raised to the power of 2. \n", "- Averaging these squared errors over all points gives the **Mean Squared Error** (MSE). " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "- We can't improve much on the model by varying the weight any more. 
\n", "- But if we add a bias (intercept), we can find values that improve the model.\n", "\n", "\n", "\n", "$$y = 0.1 X + 150$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Gradient Descent**\n", "\n", "- Automatically searches for good weight and bias values \n", "- by iteratively adjusting them to minimize the loss function.\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Regression\n", "\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T07:13:36.927633Z", "start_time": "2020-05-24T07:13:36.922702Z" }, "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import torch\n", "from torch import nn, optim\n", "from torch.autograd import Variable\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T07:17:03.772993Z", "start_time": "2020-05-24T07:17:03.518421Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAXkAAAD7CAYAAACPDORaAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAYPUlEQVR4nO3dfZCU5Z3u8e8MIEwCmGQYAgZRkfU30YTFinFNfEPztqm1CrMFYRNW426QGANbm8jheOpoBJMT1+jiHreKMosYrIOJZo3ZCgG2dgsTohJMRNGTEK6jBEnQSYWa3RwEURm6zx/PM6adM0w/Dd10c3t9/tG++366r5oerr7nfvqlrVwuY2ZmaWpvdgAzM2scl7yZWcJc8mZmCXPJm5klzCVvZpaw4c0OUGEk8H6gBzjU5CxmZseLYcBE4GfAqwOvbKWSfz/wSLNDmJkdpy4EHh042Eol3wPwn/+5n1Kp9tfud3aOprd3X91DHa1WzQWtm825auNctUktV3t7G29/+1sh79CBWqnkDwGUSuUjKvn+Y1tRq+aC1s3mXLVxrtokmmvQbW6feDUzS5hL3swsYS55M7OEFd6Tj4jbgXGSroqI6cDdwFjgx8A1kvoiYjKwGhgPCJgrqfXOcJiZvUkUWslHxIeAz1QMrQYWSDoDaAOuzseXA8sldQNPADfWMauZmdWoaslHxDuA/wF8Lb98CtAhaXM+ZRUwOyJGABcBD1aO1zmvmQ2wd7v4j3U/4MCO55odxVpQke2abwD/HTg5v3wSb3w9Zg8wCRgH7JXUN2C8Jp2do2s95HVdXWOO+NhGatVc0LrZnKuYvdvFL25cQqmvj/bhwznrK0sY2x3NjvW6Vvt59Xsz5Rqy5CNiHvAbSRsi4qp8uB2ofDFnG1AaZJx8vCa9vfuO6LWiXV1j2LPnpZqPa7RWzQWtm825ivuPx5+k1NcHpRKlvj56Hn+SVztPanYsoDV/XpBervb2tiEXx9VW8nOAiRGxFXgHMJqsyCdWzJkAvAj8DjgxIoZJOpTPebHmxGZWWEd00z58OKW+PtqGDacjupsdyVrMkHvykj4i6T2SpgNfBr4v6a+AVyLi/HzaFcB6SQfJPntmTj5+JbC+QbnNDOg4fSpnfWUJ4y7/cyYtWkzH6VObHclazJF+rMFcYEVEjAWeBO7Mx68F7o2IG4BfA586+ohmNpSx3dEyWzTWegqXvKRVZK+YQdLTwLmDzNkFzKhPNDMzO1p+x6uZWcJc8mZmCXPJm5klzCVvZpYwl7yZWcJc8mZmCXPJm5klzCVvZpYwl7yZWcJc8mZmCXPJm5klzCVvZpYwl7yZWcJc8mZmCXPJm5klzCVvZpYwl7yZWcJc8mZmCXPJm5klrNB3vEbEzcAsoAyslLQsIq4CFgOHgIeB6yT1RcRkYDUwHhAwV9K+RoQ3M7OhVV3JR8TFwKXANOAcYGFEBPBV4EOS3guMAP4mP2Q5sFxSN/AEcGMjgpuZWXVVS17SRuASSX1kq/PhwLnATyT15NN+AFweESOAi4AH8/FVwOx6hzYzs2IK7clLOhgRS4FtwAbgp8B5EXFyRAwj28qZAIwD9uZPCAA9wKT6xzYzsyLayuVy4ckR8RZgDfAAsB9YBBwAvgPMAz4GbJZ0cj5/OLBP0qgCN38qsLOW8GZm9rrTgOcHDlY98RoR3cAoSVslvRwRD5Ft19wq6ex8zmxgB/A74MSIGCbpEDAReLGWlL29+yiVij/x9OvqGsOePS/VfFyjtWouaN1szlUb56pNarna29vo7Bx9+OsL3MYUYEVEjIyIE4CZwA+BDRExJh9bCDwg6SDwCDAnP/ZKYH3Nqc3MrC6KnHhdB6wFngK2AJsk3QcsBTYDPwcelvSt/JBrgfkRsQ24ELihEcHNzKy6Qq+Tl7QEWDJgbCWwcpC5u4AZRx/NzMyOlt/xamaWMJe8mVnCXPJmZglzyZuZJcwlb2aWMJe8mVn
CXPJmZglzyZuZJcwlb2aWMJe8mVnCXPJmZglzyZuZJcwlb2aWMJe8mVnCXPJmZglzyZuZJcwlb2aWMJe8mVnCXPJmZgkr9B2vEXEzMAsoAyslLYuIjwK3AcOAJ4F5kl6LiMnAamA8IGCupH0NSW9mZkOqupKPiIuBS4FpwDnAwogIsi/x/gtJ7wHeAlyZH7IcWC6pG3gCuLERwc3MrLqqJS9pI3CJpD6y1flwYD/ZCn5sRAwDRgEHImIEcBHwYH74KmB2A3KbmVkBhfbkJR2MiKXANmAD8AJwLfAj4EVgHFmxjwP25k8IAD3ApDpnNjOzgtrK5XLhyRHxFmANsBH4NPBnwE5gGTAC+BqwWdLJ+fzhwD5Jowrc/Kn5bZmZWe1OA54fOFj1xGtEdAOjJG2V9HJEPES2iv+5pB35nBXAd4C/BU6MiGGSDgETyVb6hfX27qNUKv7E06+rawx79rxU83GN1qq5oHWzOVdtnKs2qeVqb2+js3P04a8vcBtTgBURMTIiTgBmkr165tyIeGc+ZybwM0kHgUeAOfn4lcD6mlObmVldFDnxug5YCzwFbAE2SbqF7FUzP4yIZ8hedbMoP+RaYH5EbAMuBG5oRHAzM6uu0OvkJS0BlgwYuxe4d5C5u4AZRx/NzMyOlt/xamaWMJe8mVnCXPJmZglzyZuZJcwlb2aWMJe8mVnCXPJmZglzyZuZJcwlb2aWMJe8mVnCXPJmZglzyZuZJcwlb2aWMJe8mVnCXPJmZglzyZuZJcwlb2aWMJe8mVnCXPJmZglzyZuZJazQF3lHxM3ALKAMrAS2A1+rmPIu4HFJl0XEdOBuYCzwY+AaSX11TW1mZoVUXclHxMXApcA04BxgIbBD0nRJ04E/BfYCX8wPWQ0skHQG0AZc3YjgZmZWXdWSl7QRuCRfjY8nW/3vr5hyG3CXpGcj4hSgQ9Lm/LpVwOz6RjYzs6IKbddIOhgRS4FFwD8DLwBExB8BM4B5+dSTgJ6KQ3uASbUE6uwcXcv0N+jqGnPExzZSq+aC1s3mXLVxrtq8mXIVKnkASTdFxK3AGrItmH8C5gPLJb2aT2sn27fv1waUagnU27uPUqlcfeIAXV1j2LPnpZqPa7RWzQWtm825auNctWnFXAd2PEf77p2UJp1Gx+lTazq2vb1tyMVx1ZKPiG5glKStkl6OiIfI9ucBLgc+WjF9NzCx4vIE4MWaEpuZvYkc2PEcu2//OuVDfbQNG86kRYtrLvqhFHkJ5RRgRUSMjIgTgJnAoxExjmz/fWf/REm7gFci4vx86Apgfd3Smpkl5oC2U+47CKUS5UN9HND2ut5+kROv64C1wFPAFmCTpPvJyn/3IIfMBe6IiO3AaODO+sU1M0tLR3TTNnwEtLfTNmw4HdFd19sveuJ1CbBkwNhPgfMGmfs0cG4dspmZJa/j9KlMWrT4iPfkqyl84tXMzBqj4/SpdJ13dkNOCPtjDczMEuaSNzNLmEvezCxhLnkzs4S55M3MEuaSNzNLmEvezCxhLnkzs4S55M3MEuaSNzNLmEvezCxhLnkzs4S55M3MEuaSNzNLmEvezCxhLnkzs4S55M3MEuaSNzNLWKGv/4uIm4FZQBlYKWlZRHwAuAMYAzwDfEbSaxExHbgbGAv8GLhGUl9D0puZ2ZCqruQj4mLgUmAacA6wMCL+GHgImC/prHzqZ/P/rgYWSDoDaAOurntqMzMrpGrJS9oIXJKvxseTrf6nAz+R9Ew+bSHwvYg4BeiQtDkfXwXMrntqMzMrpK1cLheaGBFLgUXAPwO/BM4CTgC6gceA64CzgdskXZAfMxVYl6/qqzkV2FljfjMzy5wGPD9wsNCePICkmyLiVmANWRl/DDgP+DWwErge+Heyfft+bUCplpS9vfsolYo98VTq6hrDnj0v1Xxco7VqLmjdbM5VG+eqTWq52tvb6Owcffjrq91ARHTnJ1OR9DLZXvz1wGZJOyUdAr4DnAvsBiZWHD4BeLHm1GZ
mVhdFXkI5BVgRESMj4gRgJjAfeF9EnJzPuQzYImkX8EpEnJ+PXwGsr3doMzMrpsiJ13XAWuApYAuwSdL/Aj4HrImI7cA7gFvyQ+YCd+Tjo4E7GxHczMyqK7QnL2kJsGTA2Fqy8h8492myrRszM2syv+PVzCxhLnkzs4S55M3MEuaSNzNLmEvezCxhLnkzs4S55M3MEuaSNzNLmEvezCxhLnkzs4S55M3MEuaSNzNLmEvezCxhLnkzs4S55M3MEuaSNzNLmEvezCxhLnkzs4S55M3MElboO14j4mZgFlAGVkpaFhHfBC4A9ufTlkr6XkR8GFgGdAAPSLqhAbnNzKyAqiUfERcDlwLTgBHAtohYC5wDXCSpp2JuB3APcDHwG2BtRHxc0vpGhDczs6FV3a6RtBG4RFIfMJ7sieEAMBm4JyKeiYilEdEOnAs8K2lnPn81MLtx8c3MbCiF9uQlHYyIpcA2YAPZiv5h4K+B84ALgc8CJwE9FYf2AJPqGdjMzIortCcPIOmmiLgVWAN8SNIn+q+LiH8ErgQeJNu379cGlGoJ1Nk5upbpb9DVNeaIj22kVs0FrZvNuWrjXLV5M+UqsiffDYyStFXSyxHxEDAnInolfTef1gYcBHYDEysOnwC8WEug3t59lErl6hMH6Ooaw549L9V8XKO1ai5o3WzOVRvnqk1qudrb24ZcHBdZyU8BlkbEBWSr9JnARuAfIuJhYB8wH7gXeByIiJgK7AQ+TXYi1szMmqDIidd1wFrgKWALsEnSzcAtwGNk+/RbJX1b0ivAVcB38/HtZFs4ZmbWBIX25CUtAZYMGFsOLB9k7gbgj+uQzczMjpLf8WpmljCXvJlZwlzyZmYJc8mbmSXMJW9mljCXvJlZwlzyZmYJc8mbmSXMJW9mljCXvJlZwlzyZmYJc8mbmSXMJW9mljCXvJlZwlzyZmYJc8mbmSXMJW9mljCXvJlZwlzyZmYJK/QdrxFxMzALKAMrJS2ruG4BMEvSjPzyZGA1MB4QMFfSvjrnNjOzAqqu5CPiYuBSYBpwDrAwIiK/7kzg+gGHLAeWS+oGngBurGtiMzMrrGrJS9oIXCKpj2x1PhzYHxEjgW8AX+6fGxEjgIuAB/OhVcDsOmc2M7OCCu3JSzoYEUuBbcAG4AXgFuAe4FcVU8cBe/MnBIAeYFL94pqZWS0K7ckDSLopIm4F1gDzgcmSvhQRMyqmtZPt21cq1RKos3N0LdPfoKtrzBEf20itmgtaN5tz1ca5avNmytVWLg/s5DeKiG5glKSt+eUvAO8DPgC8CowGJgBrgb8EeoG3SzoUEScDGyVNKZDlVGBnb+8+SqWhMw2mq2sMe/a8VPNxjdaquaB1szlXbZyrNqnlam9v618cnwY8//9dX+A2pgArImJkRJwAzAT+TdK7JU0H5gFPSJoj6SDwCDAnP/ZKYH3Nqc3MrC6KnHhdR7ZKfwrYAmySdP8Qh1wLzI+IbcCFwA31CGpmZrUrtCcvaQmw5DDX/QiYUXF5V+VlMzNrHr/j1cwsYS55M7OEueTNzBLmkjczS5hL3swsYS55M7OEueTNzBLmkjczS5hL3swsYS55M7OEueTNzBLmkjczS5hL3swsYS55M7OEueTNzBLmkjczS5hL3swsYS55M7OEueTNzBJW6DteI+JmYBZQBlZKWhYRnwcWAG1kX/S9WFI5IqYDdwNjgR8D10jqa0h6MzMbUtWVfERcDFwKTAPOARZGRABfAs4F3gt8EPhIfshqYIGkM8ieAK5uQG4zMyugaslL2ghckq/Gx5Ot/vcDZ0raD7wNOBH4fUScAnRI2pwfvgqY3YjglQ7seI7dDz7EgR3PNfquzMyOK4X25CUdjIilwDZgA/BCPnY18CugB9gKnJT/f78eYFJ9I7/RgR3Psfv2r7Prvm+z+/avu+jNzCoU2pMHkHRTRNwKrCHbgvknSSsi4pvAN4ElZHvz5YrD2oBSLYE6O0fXMp3dG3dSPtQHpRJ
l+mjfvZOu886u6TYaratrTLMjHFarZnOu2jhXbd5MuaqWfER0A6MkbZX0ckQ8BPxJRPxC0mOS+iLifuDzwDeAiRWHTwBerCVQb+8+SqVy9Ym50qTTaBs2nDJ9tA0bTmnSaezZ81Itd9lQXV1jWipPpVbN5ly1ca7apJarvb1tyMVxkZX8FGBpRFxAtkqfSfaqmfvyV9L8X7JX3jwqaVdEvBIR50t6DLgCWF9z6hp0nD6VSYsW0757J6VJp9Fx+tRG3p2Z2XGlyInXdWTbME8BW4BNkr4K3AJsAp4GXgb+Pj9kLnBHRGwHRgN3NiD3G3ScPpVJs/7cBW9mNkChPXlJS8j23CvHvkG2PTNw7tNkL600M7Mm8ztezcwS5pI3M0uYS97MLGEueTOzhBV+M9QxMAyy13weqaM5tpFaNRe0bjbnqo1z1SalXBXHDBvs+rZyufgbjxrsAuCRZocwMztOXQg8OnCwlUp+JPB+ss+7OdTkLGZmx4thZJ808DPg1YFXtlLJm5lZnfnEq5lZwlzyZmYJc8mbmSXMJW9mljCXvJlZwlzyZmYJc8mbmSWslT7WYFARMZbsy0kuk/R8RHwAuAMYAzwDfEbSa/m3VN0NjCX75qpr8q8mnAysBsYDAuZK2lfPXMCZwNcqrn4X8Liky5qZK/95fRS4jewNE08C8/Kf16D3HxFvA+4j+0awPcAnJf22AbmuAhaTvfHtYeC6oX4uDcx1E/DJ/OJaSYsj4sPAMqADeEDSDfncY/ZYDpYrHx8B/CvwFUk/aoVcETEf+Buyb457Avjcsf43eZhcnwcWkH3X9FpgsaRys3NVXLcAmCVpRn65Ib/7Lb2Sj4g/IXub7hn55bHAQ8B8SWfl0z6b/3c1sEDSGWQP6tX5+HJguaRusl/AG+udS9I6SdMlTQf+FNgLfLHZuXIrgb+Q9B7gLcCVVe7/q8Ajkt4NrAD+Z71zRUTk9/MhSe8FRpCVxLHO9WHgo8DZwHTgfRHxKeAesq+5fDfw/oj4eH7IMXksD5PrE/nP7UfABwcc0sxc/xX4L3mmaWSd8oUWyPVF4EtkX2D03jzfR1og1yfy684Erh9wSEN+91u65Ml++F/gD18G/hHgJ5KeyS8vBL4XEacAHZI25+OrgNn5quci4MHK8QbkqnQbcJekZ1sk1zBgbEQMA0YBB6rc/5+RrRoAvg18PJ9fz1zTyB7HnvzyD4DLm5Crh+wviNckHQR+SfZE9KyknZL6yAph9jF+LAfLNZlsQXMb8Hj/xBbINQq4VtJeSWXgfwOTWyBXCThT0n7gbcCJwO9bINfkiBhJ9q16X+6f2Mjf/ZberpE0DyBbwAAwFdgXEfcD3cBjwHVkz5Q9FYf2AJOAccDe/B9r5Xi9c5Ff/iNgBjAvHzqpBXJdS7b62wvsJPslGur+X8+c/wm7F+hi8Ce0I831NLAsIk7Ob3cWMKEJuX7R///5Y/dJ4B8Z/DE7Zo/lYXKdL+nZfOxvK6a3Uq4usu2Rq1ok18GIuBq4HfgpsBV4X7NzkX0/9j1k/x77Nex3v9VX8gMNBz4G/DeyB+utZH/ytJPtB/ZrI3smHzhOPt4o88n+3Or/kKCm5oqICcDfAe8h+wCjzWR7zUPd/8DPOm2jztkk/R+yx+37ZJ88+gzwWrNyRcRZwL+TbTv8imKPWcMfy8pc/UU6iJbIFRHvAjYAK/NzBS2RS9IKoBP4Ldn3VDc1F3AqMFnSNwdMa9jv/vFW8r8FNud/Sh8CvkO257abrMT6TSB7lvsdcGK+VUE+54hXfgVcDtxfcbnZuS4Efi5ph6QS2X7ejCr3/0Kek4gYTnaCu7eeoSJiFPBTSWdL+mB+nzuakSsizicrp+sl3cvhH7Nj+lgOkutwmp4rIrrJTqrfK+krrZArIk7Ox8hXx/eTbRM2++f1KeCsiNhKdvL3nIh4oMr9H9Xv/vFW8v9GdvLi5PzyZcAWSbuAV/ofVOAKYH2+D/YIMCcfvxJY34hgETG
ObK/v9T/BWiDXz4FzI+Kd+eWZwM+q3P86/nBydg7ZCZ+Ddc71VmBDRIyJiBPIzq08cKxz5b9H/wJ8WlL/k/Pj2VUxNf8H92myx+yYPZaHyTWoZueKiDFk/y5vkPT3rZKLbA/+voh4W0S0kW0JPtrsXJL+WtK78xdpzAOekDSnkb/7Lb0nP5Ck30TE54A1+WpwK7Aov3ousCJ/Bc6TwJ35+LXAvRFxA/BrsmfSRphCtkoYqGm5JP0yIm4EfhgRfcBzZFtKQ93/jcCqiPgF8Ps8f71z9UbEUrLtoxHAtyR9qwm5FpGdOFxWcb7gLrI95e/m163jDyfDjtVjOWguSXcdZn4zcz0AvBO4LiKuy8e+L+nLTc51F9ne9yagj6xA+5+EWvVxbMjvvj9P3swsYcfbdo2ZmdXAJW9mljCXvJlZwlzyZmYJc8mbmSXMJW9mljCXvJlZwlzyZmYJ+3+pAiziSgrM0QAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "x_train = np.array([[2104],[1600],[2400]], dtype=np.float32)\n", "y_train = np.array([[399.900], [329.900], [369.000]], dtype=np.float32)\n", "\n", "plt.plot(x_train, y_train, 'r.')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T07:17:05.263887Z", "start_time": "2020-05-24T07:17:05.258498Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "x_train = torch.from_numpy(x_train)\n", "y_train = torch.from_numpy(y_train)" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2019-04-07T04:47:17.479477Z", "start_time": "2019-04-07T04:47:17.468259Z" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "**nn.Linear**\n", "\n", "> help(nn.Linear)\n", "\n", "Applies a linear transformation to the incoming data: $y = xA^T + b$\n", "\n", "- **in_features**: size of each input sample\n", "- **out_features**: size of each output sample\n", "- **bias**: If set to False, the layer will not learn an additive bias. 
Default: ``True``" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T07:17:06.464409Z", "start_time": "2020-05-24T07:17:06.458690Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Linear Regression Model\n", "class LinearRegression(nn.Module):\n", "    def __init__(self):\n", "        super(LinearRegression, self).__init__()\n", "        self.linear = nn.Linear(1, 1)  # input and output are both 1-dimensional\n", "\n", "    def forward(self, x):\n", "        out = self.linear(x)\n", "        return out\n", "\n", "model = LinearRegression()" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T07:17:07.149424Z", "start_time": "2020-05-24T07:17:07.146247Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Define the loss function and the optimizer\n", "criterion = nn.MSELoss()\n", "# Small step size because the inputs are unscaled (in the thousands); lr=1e-4 diverges here\n", "optimizer = optim.SGD(model.parameters(), lr=1e-9)" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2019-06-20T13:07:42.012941Z", "start_time": "2019-06-20T13:07:42.003484Z" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "> help(nn.MSELoss)\n", "\n", "Measures the **mean squared error** (squared L2 norm) between each element in the input `x` and target `y`.\n", "\n", "> help(optim.SGD)\n", "\n", "Implements **stochastic gradient descent** (optionally with momentum).\n", "\n", "Momentum is a variation on stochastic gradient descent that also takes previous updates into account and often leads to faster training." 
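, "

The two ideas above can be illustrated without PyTorch. What follows is a minimal plain-Python sketch, not the library implementation: the helper names `mse` and `momentum_step` are hypothetical, the toy data (y = 2x) is reused from the gradient examples earlier in the chapter, and the velocity update shown is an assumption based on the common heavy-ball formulation of momentum:

```python
def mse(predictions, targets):
    # Mean of the squared differences between predictions and targets.
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def momentum_step(w, velocity, grad, lr=0.01, momentum=0.9):
    # Blend the previous velocity with the new gradient, then step along it.
    velocity = momentum * velocity + grad
    return w - lr * velocity, velocity

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
w, velocity = 1.0, 0.0
for _ in range(200):
    preds = [w * x for x in xs]
    # Gradient of the MSE with respect to w for the model y = w * x.
    grad = sum(2 * x * (p - y) for x, p, y in zip(xs, preds, ys)) / len(xs)
    w, velocity = momentum_step(w, velocity, grad)

print(w, mse([w * x for x in xs], ys))
```

Setting `momentum=0` in the sketch reduces the update to plain gradient descent, which makes it easy to compare the two on the same data.
"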
] }, { "cell_type": "code", "execution_count": 72, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T07:17:08.856804Z", "start_time": "2020-05-24T07:17:08.642890Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch[50/1000], loss: 524703.812500\n", "Epoch[100/1000], loss: 224658.125000\n", "Epoch[150/1000], loss: 96851.820312\n", "Epoch[200/1000], loss: 42411.964844\n", "Epoch[250/1000], loss: 19222.966797\n", "Epoch[300/1000], loss: 9345.485352\n", "Epoch[350/1000], loss: 5138.111816\n", "Epoch[400/1000], loss: 3345.956299\n", "Epoch[450/1000], loss: 2582.575439\n", "Epoch[500/1000], loss: 2257.412354\n", "Epoch[550/1000], loss: 2118.905518\n", "Epoch[600/1000], loss: 2059.905518\n", "Epoch[650/1000], loss: 2034.777222\n", "Epoch[700/1000], loss: 2024.072266\n", "Epoch[750/1000], loss: 2019.512207\n", "Epoch[800/1000], loss: 2017.569946\n", "Epoch[850/1000], loss: 2016.742554\n", "Epoch[900/1000], loss: 2016.390259\n", "Epoch[950/1000], loss: 2016.241577\n", "Epoch[1000/1000], loss: 2016.177734\n" ] } ], "source": [ "num_epochs = 1000\n", "for epoch in range(num_epochs):\n", "    inputs = Variable(x_train)\n", "    target = Variable(y_train)\n", "    # forward pass\n", "    out = model(inputs)\n", "    loss = criterion(out, target)\n", "    # backward pass\n", "    optimizer.zero_grad()  # clear the gradients of all optimized tensors\n", "    loss.backward()  # compute the gradients of the loss\n", "    optimizer.step()  # perform a single optimization step\n", "\n", "    if (epoch + 1) % 50 == 0:\n", "        print('Epoch[{}/{}], loss: {:.6f}'\n", "              .format(epoch + 1, num_epochs, loss.item()))\n" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T07:17:10.284215Z", "start_time": "2020-05-24T07:17:10.002466Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "predict = model(Variable(x_train))\n", "predict = predict.data.numpy()\n", "plt.plot(x_train.numpy(), y_train.numpy(), 'ro', label='Original data')\n", "plt.plot(x_train.numpy(), predict, 'b-s', label='Fitting Line')\n", "plt.xlabel('X', fontsize= 20)\n", "plt.ylabel('y', fontsize= 20)\n", "plt.legend( fontsize= 20)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Have a try" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T07:28:06.556729Z", "start_time": "2020-05-24T07:28:06.551348Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "x_train = np.array([[3.3], [4.4], [5.5], [6.71], [6.93], [4.168],\n", " [9.779], [6.182], [7.59], [2.167], [7.042],\n", " [10.791], [5.313], [7.997], [3.1]], dtype=np.float32)\n", "\n", "y_train = np.array([[1.7], [2.76], [2.09], [3.19], [1.694], [1.573],\n", " [3.366], [2.596], [2.53], [1.221], [2.827],\n", " [3.465], [1.65], [2.904], [1.3]], dtype=np.float32)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Classification\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Activation Function\n" ] }, { "cell_type": "code", "execution_count": 186, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T12:55:28.328450Z", "start_time": "2020-05-24T12:55:28.098083Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def sigmoid(x):\n", " return 1/(1 + np.exp(-x))\n", "\n", "plt.plot(range(-10, 10), [sigmoid(i) for i in range(-10, 10)])\n", "plt.xlabel('x', fontsize = 20)\n", "plt.ylabel('sigmoid', fontsize = 20);" ] }, { "cell_type": "code", "execution_count": 187, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T12:55:36.324672Z", "start_time": "2020-05-24T12:55:36.088573Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Naive scalar relu implementation. \n", "# In the real world, most calculations are done on vectors\n", "def relu(x):\n", " if x < 0:\n", " return 0\n", " else:\n", " return x\n", "\n", "\n", "plt.plot(range(-10, 10), [relu(i) for i in range(-10, 10)])\n", "plt.xlabel('x', fontsize = 20)\n", "plt.ylabel('relu', fontsize = 20);" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Softmax\n", "\n", "The softmax function, also known as softargmax or normalized exponential function, is a function that takes as input a vector of K real numbers, and normalizes it into a probability distribution consisting of K probabilities. \n", "\n", "$$softmax = \\frac{e^x}{\\sum e^x}$$\n" ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T07:32:40.058795Z", "start_time": "2020-05-24T07:32:40.052704Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "array([0.02364054, 0.06426166, 0.1746813 , 0.474833 , 0.02364054,\n", " 0.06426166, 0.1746813 ])" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def softmax(s):\n", " return np.exp(s) / np.sum(np.exp(s), axis=0)\n", "\n", "softmax([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0])" ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T07:32:41.063124Z", "start_time": "2020-05-24T07:32:40.843218Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(range(10), softmax(range(10)))\n", "plt.xlabel('x', fontsize = 20)\n", "plt.ylabel('softmax', fontsize = 20);" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Softmax is often used in neural networks, to map the non-normalized output of a network to a probability distribution over predicted output classes.\n", "\n", "- Prior to applying softmax, some vector components could be negative, or greater than one; and might not sum to 1;\n", "- After applying softmax, each component will be in the interval (0,1), and the components will add up to 1, so that they can be interpreted as probabilities. Furthermore, the larger input components will correspond to larger probabilities. \n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Logistic Regression \n", "\n", "![image.png](./images/nn15.png)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/nn16.png)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/nn17.png)\n" ] }, { "cell_type": "code", "execution_count": 188, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T12:56:55.850226Z", "start_time": "2020-05-24T12:56:55.846210Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "from torch import tensor\n", "from torch import nn\n", "from torch import sigmoid\n", "import torch.nn.functional as F\n", "import torch.optim as optim\n", "\n", "# Training data and ground truth\n", "x_data = tensor([[1.0], [2.0], [3.0], [4.0]])\n", "y_data = tensor([[0.], [0.], [1.], [1.]])" ] }, { "cell_type": "code", "execution_count": 189, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T12:57:02.236442Z", "start_time": 
"2020-05-24T12:57:02.232697Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class Model(nn.Module):\n", " def __init__(self):\n", " \"\"\"\n", " In the constructor we instantiate nn.Linear module\n", " \"\"\"\n", " super(Model, self).__init__()\n", " self.linear = nn.Linear(1, 1) # One in and one out\n", "\n", " def forward(self, x):\n", " \"\"\"\n", " In the forward function we accept a Variable of input data and we must return\n", " a Variable of output data.\n", " \"\"\"\n", " y_pred = sigmoid(self.linear(x))\n", " return y_pred" ] }, { "cell_type": "code", "execution_count": 190, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T12:57:13.366861Z", "start_time": "2020-05-24T12:57:13.363187Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# our model\n", "model = Model()\n", "\n", "# Construct our loss function and an Optimizer. The call to model.parameters()\n", "# in the SGD constructor will contain the learnable parameters of the two\n", "# nn.Linear modules which are members of the model.\n", "criterion = nn.BCELoss(reduction='mean')\n", "optimizer = optim.SGD(model.parameters(), lr=0.01)" ] }, { "cell_type": "code", "execution_count": 192, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T12:57:54.191754Z", "start_time": "2020-05-24T12:57:53.997444Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/1000 | Loss: 0.3840\n", "Epoch 101/1000 | Loss: 0.3748\n", "Epoch 201/1000 | Loss: 0.3661\n", "Epoch 301/1000 | Loss: 0.3579\n", "Epoch 401/1000 | Loss: 0.3501\n", "Epoch 501/1000 | Loss: 0.3427\n", "Epoch 601/1000 | Loss: 0.3357\n", "Epoch 701/1000 | Loss: 0.3290\n", "Epoch 801/1000 | Loss: 0.3226\n", "Epoch 901/1000 | Loss: 0.3166\n" ] } ], "source": [ "# Training loop\n", "for k, epoch in enumerate(range(1000)):\n", " # Forward pass: Compute predicted y by passing x to the model\n", " y_pred = model(x_data)\n", "\n", " 
# Compute and print loss\n", " loss = criterion(y_pred, y_data)\n", " if k % 100 == 0:\n", "     print(f'Epoch {epoch + 1}/1000 | Loss: {loss.item():.4f}')\n", "\n", " # Zero gradients, perform a backward pass, and update the weights.\n", " optimizer.zero_grad()\n", " loss.backward()\n", " optimizer.step()" ] }, { "cell_type": "code", "execution_count": 198, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T13:01:50.071575Z", "start_time": "2020-05-24T13:01:50.066628Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Let's predict the hours needed to score above 50%\n", "==================================================\n", "Prediction for x = 1.0, y_pred = 0.1998 | Above 50%: False\n", "Prediction for x = 7.0, y_pred = 0.9969 | Above 50%: True\n" ] } ], "source": [ "# After training\n", "print(f'Let\\'s predict the hours needed to score above 50%\\n{\"=\" * 50}')\n", "y_pred = model(tensor([[1.0]]))\n", "print(f'Prediction for x = 1.0, y_pred = {y_pred.item():.4f} | Above 50%: {y_pred.item() > 0.5}')\n", "y_pred = model(tensor([[7.0]]))\n", "print(f'Prediction for x = 7.0, y_pred = {y_pred.item():.4f} | Above 50%: {y_pred.item() > 0.5}')\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Diabetes Classification\n", "\n", "![image.png](./images/nn18.png)\n" ] }, { "cell_type": "code", "execution_count": 199, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T13:18:56.152030Z", "start_time": "2020-05-24T13:18:56.137068Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "X's shape: torch.Size([759, 8]) | Y's shape: torch.Size([759, 1])\n" ] } ], "source": [ "from torch import nn, optim, from_numpy\n", "import numpy as np\n", "\n", "xy = np.loadtxt('../data/diabetes.csv.gz', delimiter=',', dtype=np.float32)\n", "x_data = from_numpy(xy[:, 0:-1])\n", "y_data = from_numpy(xy[:, 
[-1]])\n", "print(f'X\\'s shape: {x_data.shape} | Y\\'s shape: {y_data.shape}')\n" ] }, { "cell_type": "code", "execution_count": 200, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T13:19:17.672598Z", "start_time": "2020-05-24T13:19:17.667176Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class Model(nn.Module):\n", "    def __init__(self):\n", "        \"\"\"\n", "        In the constructor we instantiate three nn.Linear modules\n", "        and one nn.Sigmoid module.\n", "        \"\"\"\n", "        super(Model, self).__init__()\n", "        self.l1 = nn.Linear(8, 6)\n", "        self.l2 = nn.Linear(6, 4)\n", "        self.l3 = nn.Linear(4, 1)\n", "        self.sigmoid = nn.Sigmoid()\n", "\n", "    def forward(self, x):\n", "        \"\"\"\n", "        In the forward function we accept a tensor of input data and must return\n", "        a tensor of output data. We can use modules defined in the constructor as\n", "        well as arbitrary operators on tensors.\n", "        \"\"\"\n", "        x = self.sigmoid(self.l1(x))\n", "        x = self.sigmoid(self.l2(x))\n", "        y_pred = self.sigmoid(self.l3(x))\n", "        return y_pred" ] }, { "cell_type": "code", "execution_count": 201, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T13:20:09.530806Z", "start_time": "2020-05-24T13:20:09.526832Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# our model\n", "model = Model()\n", "\n", "\n", "# Construct our loss function and an Optimizer. 
The call to model.parameters()\n", "# in the SGD constructor passes in the learnable parameters of the three\n", "# nn.Linear modules which are members of the model.\n", "criterion = nn.BCELoss(reduction='mean')\n", "optimizer = optim.SGD(model.parameters(), lr=0.1)\n" ] }, { "cell_type": "code", "execution_count": 207, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T13:22:13.307900Z", "start_time": "2020-05-24T13:22:12.669459Z" }, "code_folding": [], "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 1/1000 | Loss: 0.6444\n", "Epoch: 201/1000 | Loss: 0.6441\n", "Epoch: 401/1000 | Loss: 0.6437\n", "Epoch: 601/1000 | Loss: 0.6431\n", "Epoch: 801/1000 | Loss: 0.6424\n", "Epoch: 1000/1000 | Loss: 0.6413\n" ] } ], "source": [ "# Training loop\n", "for epoch in range(1000):\n", "    # Forward pass: Compute predicted y by passing x to the model\n", "    y_pred = model(x_data)\n", "\n", "    # Compute and print loss\n", "    loss = criterion(y_pred, y_data)\n", "    if epoch % 200 == 0:\n", "        print(f'Epoch: {epoch + 1}/1000 | Loss: {loss.item():.4f}')\n", "\n", "    # Zero gradients, perform a backward pass, and update the weights.\n", "    optimizer.zero_grad()\n", "    loss.backward()\n", "    optimizer.step()\n", "print(f'Epoch: {epoch + 1}/1000 | Loss: {loss.item():.4f}')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "![image.png](./images/nn19.png)\n", "\n", "The images in CIFAR-10 are of size 3x32x32, i.e. 3-channel color images of 32x32 pixels. 
http://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html \n" ] }, { "cell_type": "code", "execution_count": 214, "metadata": { "ExecuteTime": { "end_time": "2020-05-24T14:21:11.794120Z", "start_time": "2020-05-24T14:21:11.787466Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class Net(nn.Module):\n", "    def __init__(self):\n", "        super(Net, self).__init__()\n", "        self.conv1 = nn.Conv2d(3, 6, 5)  # in_channels = 3, out_channels = 6, kernel_size = 5\n", "        self.pool = nn.MaxPool2d(2, 2)  # max pooling over a square window of size = 2, stride = 2\n", "        self.conv2 = nn.Conv2d(6, 16, 5)  # in_channels = 6, out_channels = 16, kernel_size = 5\n", "        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # in_features = 16*5*5, out_features = 120\n", "        self.fc2 = nn.Linear(120, 84)  # in_features = 120, out_features = 84\n", "        self.fc3 = nn.Linear(84, 10)  # in_features = 84, out_features = 10\n", "\n", "    def forward(self, x):\n", "        x = self.pool(F.relu(self.conv1(x)))\n", "        x = self.pool(F.relu(self.conv2(x)))\n", "        x = x.view(-1, 16 * 5 * 5)  # Flatten the data (n, 16, 5, 5) -> (n, 400)\n", "        x = F.relu(self.fc1(x))\n", "        x = F.relu(self.fc2(x))\n", "        x = self.fc3(x)\n", "        return x" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Run in Google Colab\n", "\n", "![image.png](./images/nn20.png)\n", "\n", "https://colab.research.google.com/github/pytorch/tutorials/blob/gh-pages/_downloads/cifar10_tutorial.ipynb" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "![image.png](./images/nn21.png)\n", "\n", "Deep Learning video series: https://space.bilibili.com/88461692/channel/detail?cid=26587" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "\n", "\n", "![image.png](./images/end.png)\n" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": 
"python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "170px" }, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }