{ "cells": [ { "cell_type": "markdown", "id": "6acf32cc-e720-4ff6-9905-bcacb3030aaa", "metadata": {}, "source": [ "# Notebook 5: PyTorch [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mattsankner/micrograd/blob/main/mg5_pytorch.ipynb) [![View in nbviewer](https://img.shields.io/badge/view-nbviewer-orange)](https://nbviewer.jupyter.org/github/mattsankner/micrograd/blob/main/mg5_pytorch.ipynb)" ] }, { "cell_type": "markdown", "id": "a10e6c35-868b-47d8-83e0-e741889cbcee", "metadata": {}, "source": [ "### Now, we will build the same forward and backward pass, but all with PyTorch!\n", "\n", "Engineers typically use a modern deep neural network library like PyTorch in production We can do the same thing with the PyTorch API. " ] }, { "cell_type": "code", "execution_count": 26, "id": "f20503d7-0738-44f1-b736-bd13b1875d17", "metadata": {}, "outputs": [], "source": [ "import math\n", "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 27, "id": "2b98f012-9987-4ad4-8d93-2005d63e4cc2", "metadata": {}, "outputs": [], "source": [ "class Value:\n", " def __init__(self, data, _children=(), _op='', label=''):\n", " self.data = data\n", " self.grad = 0.0\n", " self._backward = lambda: None\n", " self._prev = set(_children)\n", " self._op = _op\n", " self.label = label\n", "\n", " def __repr__(self):\n", " return f\"Value(data={self.data})\"\n", " \n", " def __add__(self, other):\n", " other = other if isinstance(other, Value) else Value(other)\n", " out = Value(self.data + other.data, (self, other), '+')\n", " \n", " def _backward():\n", " self.grad += 1.0 * out.grad\n", " other.grad += 1.0 * out.grad\n", " out._backward = _backward\n", " \n", " return out\n", "\n", " def __mul__(self, other):\n", " other = other if isinstance(other, Value) else Value(other)\n", " out = Value(self.data * other.data, (self, other), '*')\n", " \n", " def _backward():\n", " self.grad += other.data * out.grad\n", " other.grad += self.data * out.grad\n", " out._backward = _backward\n", " \n", " return out\n", "\n", " def __neg__(self): #-self\n", " return self * -1\n", "\n", " def __sub__(self, other): #self-other; implement thru addition by negation, mult by -1 for the negation (what we've built)\n", " return self + (-other)\n", " \n", " def __pow__(self, other): #self to the pow of other\n", " assert isinstance(other, (int, float)), \"only supporting int/float powers for now\"\n", " out = Value(self.data**other, (self,),f'**{other}')\n", "\n", " def _backward(): #what's the chain rule for backprop thru the power function, where power is power of some kind of constant\n", " self.grad += other * self.data ** (other -1) * out.grad\n", " #other * self.data ** (other -1) is the local derivative only, but then have to chain it by mult by out.grad\n", " #self.data is an int or a float, not a Value obj, just accessing .data prop\n", " #to do the above exercises, go to the derivative rules\n", " out._backward = _backward\n", " return out\n", "\n", " def __rmul__(self,other): #other * self; fallback for python not being able to do num * self, check if rmul in value, call it reverse\n", " return self * other\n", "\n", " def __truediv__(self, other): #self/other\n", " return self*other**-1\n", " \n", " def tanh(self):\n", " x = self.data\n", " t = (math.exp(2*x) - 1)/(math.exp(2*x) + 1)\n", " out = Value(t, (self, ), 'tanh')\n", " \n", " def _backward():\n", " self.grad += (1 - t**2) * out.grad\n", " out._backward 
= _backward\n", " \n", " return out\n", "\n", " def exp(self): #mirrors tanh; inputs, transforms, and outputs a single scalar value\n", " x = self.data\n", " out = Value(math.exp(x), (self, ), 'exp')\n", "\n", " #how do you backpropogate through e^x We need to know the local deriv of e^x. D/dx of e^x is e^x\n", " #eturns E raised to the power of x (Ex).\n", " #'E' is the base of the natural system of logarithms (approximately 2.718282) and x is the number passed to it.\n", " def _backward():\n", " self.grad += out.data * out.grad\n", " out._backward = _backward\n", " return out\n", " \n", " def backward(self):\n", " topo = []\n", " visited = set()\n", " def build_topo(v):\n", " if v not in visited:\n", " visited.add(v)\n", " for child in v._prev:\n", " build_topo(child)\n", " topo.append(v)\n", " build_topo(self)\n", " \n", " self.grad = 1.0\n", " for node in reversed(topo):\n", " node._backward()" ] }, { "cell_type": "code", "execution_count": 28, "id": "822945fc-908e-460d-b4a1-edadd3dbdd97", "metadata": {}, "outputs": [], "source": [ "from graphviz import Digraph\n", "\n", "def trace(root):\n", " # builds a set of all nodes and edges in a graph\n", " nodes, edges = set(), set()\n", " def build(v):\n", " if v not in nodes:\n", " nodes.add(v)\n", " for child in v._prev:\n", " edges.add((child, v))\n", " build(child)\n", " build(root)\n", " return nodes, edges\n", "\n", "def draw_dot(root):\n", " dot = Digraph(format='svg', graph_attr={'rankdir': 'LR'}) # LR = left to right\n", " \n", " nodes, edges = trace(root)\n", " for n in nodes:\n", " uid = str(id(n))\n", " # for any value in the graph, create a rectangular ('record') node for it\n", " dot.node(name = uid, label = \"{ %s | data %.4f | grad %.4f }\" % (n.label, n.data, n.grad), shape='record')\n", " if n._op:\n", " # if this value is a result of some operation, create an op node for it\n", " dot.node(name = uid + n._op, label = n._op)\n", " # and connect this node to it\n", " dot.edge(uid + n._op, uid)\n", "\n", " for n1, n2 in edges:\n", " # connect n1 to the op node of n2\n", " dot.edge(str(id(n1)), str(id(n2)) + n2._op)\n", "\n", " return dot" ] }, { "cell_type": "markdown", "id": "06684d09-ebf9-42ea-93da-5871b0ad525e", "metadata": {}, "source": [ "### The micrograd we have already built is a scalar valued engine, which means it can only take in scalar values, like Value(2.0). In PyTorch, everything is based around tensors, which are n-dimensional arrays of scalars. Thus, we need a scalar valued tensor. 
" ] }, { "cell_type": "code", "execution_count": 29, "id": "d1806d87-fd5b-4b5c-9e1f-e8106d59c4af", "metadata": {}, "outputs": [], "source": [ "import torch " ] }, { "cell_type": "code", "execution_count": 37, "id": "bbad52f4-91b4-4b46-b523-fa74442459be", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([[1., 2., 3.],\n", " [4., 5., 6.]])" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#This is a tensor:\n", "torch.Tensor([[1,2,3],[4,5,6]])" ] }, { "cell_type": "code", "execution_count": 38, "id": "6754ec42-6d35-4156-a24d-03ba2e1d6c21", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.Size([2, 3])" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Check its shape:\n", "torch.Tensor([[1,2,3],[4,5,6]]).shape" ] }, { "cell_type": "markdown", "id": "c6286390-83a4-4fe7-9e00-caaa2c6be162", "metadata": {}, "source": [ "By default, python by default uses double precision for its floating points, which is byte size float(64). You can cast a tensor to double so it matches what python is expecting:" ] }, { "cell_type": "code", "execution_count": 42, "id": "e7d73e61-bbaf-40bb-832a-c834670913f3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.float32" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#single precision\n", "torch.Tensor([2.0]).dtype" ] }, { "cell_type": "code", "execution_count": 43, "id": "ef28c09a-b7ff-4a56-8358-6da7e9b4a3bd", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "torch.float64" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#casted to double precision so float 32 (default) is cast to float 64\n", "torch.Tensor([2.0]).double().dtype" ] }, { "cell_type": "markdown", "id": "b027793c-e9fd-4c33-bac8-8b4d5f656dc7", "metadata": {}, "source": [ "PyTorch automatically assumes that leaf nodes we declare don't require gradients. By default, requires_grad is set to false for efficiency because you wouldn't usually want gradients for leaf nodes as input to the network. We can explicitly say all nodes require gradients, though. \n", "\n", "Now, we will construct scalar valued, one element tensors. Once we have defined all values, we can perform arithmetic on the tensors." 
] }, { "cell_type": "code", "execution_count": 46, "id": "a22f4600-3cd1-46cd-92a8-89389236d1cb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.7071066904050358\n", "---\n", "x2 0.5000001283844369\n", "w2 0.0\n", "x1 -1.5000003851533106\n", "w1 1.0000002567688737\n" ] } ], "source": [ "#just like in micrograd, these tensor objects have a .data and a .grad...\n", "x1 = torch.Tensor([2.0]).double() ; x1.requires_grad = True\n", "x2 = torch.Tensor([0.0]).double() ; x2.requires_grad = True\n", "w1 = torch.Tensor([-3.0]).double() ; w1.requires_grad = True\n", "w2 = torch.Tensor([1.0]).double() ; w2.requires_grad = True\n", "b = torch.Tensor([6.8813735870195432]).double() ; b.requires_grad = True\n", "n = x1*w1 + x2*w2 + b\n", "o = torch.tanh(n)\n", "\n", "print(o.data.item()) #.item() takes a single tensor of one element and returns element, stripping out the tensor\n", "\n", "o.backward() #prints forward pass.\n", "\n", "#prints gradients/backwards pass\n", "print('---')\n", "print('x2', x2.grad.item())\n", "print('w2', w2.grad.item())\n", "print('x1', x1.grad.item())\n", "print('w1', w1.grad.item())" ] }, { "cell_type": "markdown", "id": "653512ba-f606-41ba-8f71-7ea2235216fd", "metadata": {}, "source": [ "PyTorch can do what we did in micrograd, as a special case when your tensors are all single-element tensors. \n", "\n", "With PyTorch, everything is much more efficient because we're working with tensor objects. Many operations can work in parallel on these tensors. Everything we've built agrees with API of PyTorch." ] }, { "cell_type": "code", "execution_count": 47, "id": "7acddc98-1af1-45a2-9a1b-817f54a6e801", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "tensor([0.7071], dtype=torch.float64, grad_fn=)" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "o #tensor object, has a backward function just like what we implemented" ] }, { "cell_type": "code", "execution_count": 49, "id": "fe7b8601-8187-4684-890a-37fa4021b93d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.7071066904050358" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "o.item() #same as o.data.item()" ] }, { "cell_type": "code", "execution_count": 50, "id": "1362dda5-6d33-4205-8b42-0ef2b318d374", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-1.5000003851533106" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#all of the variables have a .grad\n", "x1.grad.item() #-> grad is a tensor, pop out number with .item()" ] }, { "cell_type": "markdown", "id": "65432123-430b-4174-84f7-2c432ddec9fe", "metadata": {}, "source": [ "## Now that we have some machinery to build pretty complicated mathematical expressions in a neuron, we can begin building out are neural network.\n", "\n", "Neural networks are a specific class of mathematic expressions. We will start building out a neural netwrok piece by piece, and eventually build out a two layer ```Multi-Layer Perceptron```. \n", "\n", "Let's start with a single individual neuron. We'll make our above neuron subscribe to PyTorch's API and its specific neural network modules.\n", "\n", "![](mlp.jpeg)" ] }, { "cell_type": "markdown", "id": "02f1beb6-d9f8-467f-95ad-f863d348eb4e", "metadata": {}, "source": [ "Now, we're going to define a layer on neurons... So look up schematic for MLP. MultiLayer Perceptron.\n", "Notice how there are multiple layers with multiple neurons. 
"Notice how there are multiple layers with multiple neurons. The neurons within a layer are not connected to each other; rather, each one is connected to all of the inputs. A layer of neurons is a set of neurons evaluated independently." ] },
{ "cell_type": "markdown", "id": "5559faf0-a5a8-4216-a67c-d5ef998cc87e", "metadata": {}, "source": [ "### Below, we initialize the Neuron class.\n", "We define a constructor and ```__call__(self,x)```.\n", "\n", "For the forward pass, we need to multiply all of the elements of ```w``` with all of the elements of ```x```, pair-wise.\n", "\n", "To visualize this, we write ```list(zip(self.w,x))``` after we have initialized our ```x's``` and ```n```, where the ```x's``` are the input data and ```n``` is the Neuron we're feeding them into.\n", "\n", "Inside ```__call__```, ```zip``` pairs the two lists into an iterator over tuples of corresponding entries, such as ```(self.w[3], x[3])```.\n" ] },
{ "cell_type": "code", "execution_count": 105, "id": "4fc028d3-13af-48c0-9d6c-3f498488d1cd", "metadata": {}, "outputs": [], "source": [ "import random" ] },
{ "cell_type": "code", "execution_count": 108, "id": "866f88c8-ab9b-4efb-9c13-9bc5e0d2fc1e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[(Value(data=-0.891937987300732), 2.0), (Value(data=-0.7191340580027186), 3.0)]\n" ] } ], "source": [ "class Neuron:\n", "    #the constructor takes the number of inputs to the neuron,\n", "    #creates a weight for each input as a random number between -1 and 1,\n", "    #and a bias that controls the trigger happiness of the neuron\n", "    def __init__(self, nin):\n", "        self.w = [Value(random.uniform(-1,1)) for _ in range(nin)]\n", "        self.b = Value(random.uniform(-1,1))\n", "\n", "    #w * x + b -> w*x is a dot product\n", "    def __call__(self, x): #this will be called with n(x)\n", "        print(list(zip(self.w, x)))\n", "\n", "x = [2.0, 3.0]\n", "n = Neuron(2) #initialize a 2 dimensional neuron\n", "n(x) #feed the nums into the neuron" ] },
{ "cell_type": "markdown", "id": "ee4d57ec-720d-4e89-af35-cb2006d8caba", "metadata": {}, "source": [ "Now that you can visualize it, we'll create the raw activation, which is the dot product of the weights and the inputs (the sum of the pair-wise products) plus the bias. After we create that, we need to pass it through the non-linearity, so we call ```tanh()``` on it.\n", "\n", "Test the code below. Notice how you get a different answer each time, because we initialize different weights and biases each time."
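,
"\n",
"\n",
"If you want a repeatable run, a minimal sketch (the seed value is arbitrary) is to seed Python's ```random``` module before constructing the neuron:\n",
"\n",
"```python\n",
"import random\n",
"\n",
"random.seed(1337) #any fixed integer works here\n",
"n = Neuron(2)     #the weights and bias now come out the same on every run\n",
"```"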
] }, { "cell_type": "code", "execution_count": 115, "id": "d0360f85-9c52-432d-91be-ffa4fcfac73f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Value(data=0.9702820556902484)" ] }, "execution_count": 115, "metadata": {}, "output_type": "execute_result" } ], "source": [ "class Neuron:\n", " def __init__(self,nin): \n", " self.w = [Value(random.uniform(-1,1)) for _ in range(nin)]\n", " self.b = Value(random.uniform(-1,1))\n", "\n", " def __call__(self,x): \n", "\n", " #It computes the weighted sum of the inputs plus the bias (act), \n", " #using the dot product of weights self.w and inputs x, starting the sum with self.b.\n", " \n", " #create raw activation function:\n", " #sum the product for all elements of w and all elements of x (pairs)\n", " #by default, builds a sum on top of 0.0, so we start with self.b instead (optional param)\n", " act = sum((wi*xi for wi, xi in zip(self.w, x)), self.b)\n", " \n", " #pass through non-linearity\n", " out = act.tanh()\n", " \n", " return out\n", "\n", "x = [2.0, 3.0]\n", "n = Neuron(2) #initialize a 2 dimensional neuron \n", "n(x) #feed the nums into the neuron " ] }, { "cell_type": "markdown", "id": "ee502927-50f3-426e-8b29-accf99a55245", "metadata": {}, "source": [ "### Now, we'll define a layer of neurons, which will contain all of the neurons connected to the previous set of neuron(s) as input and connected to the next set of neuron(s) as output.\n", "\n" ] }, { "cell_type": "code", "execution_count": 117, "id": "be2d7319-da97-44ff-bebe-4f711cd51e57", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[Value(data=-0.15572956132854174),\n", " Value(data=0.9984237412005323),\n", " Value(data=-0.9748622220216048)]" ] }, "execution_count": 117, "metadata": {}, "output_type": "execute_result" } ], "source": [ "class Neuron:\n", " def __init__(self,nin): \n", " self.w = [Value(random.uniform(-1,1)) for _ in range(nin)]\n", " self.b = Value(random.uniform(-1,1))\n", "\n", " def __call__(self,x): \n", " act = sum((wi*xi for wi, xi in zip(self.w, x)), self.b) \n", " out = act.tanh()\n", " return out\n", "\n", "class Layer: #A layer is a list of neurons, one one layer in an MLP\n", "\n", " #nin = num inputs to each each neuron in the layer\n", " #nout = number of neurons in the layer\n", "\n", " #init a list of self.neurons containing nout Neuron objects, each with nin inputs\n", " def __init__(self, nin, nout): \n", " self.neurons = [Neuron(nin) for _ in range(nout)]\n", "\n", " def __call__(self, x):\n", " outs = [n(x) for n in self.neurons] #apply x to each neuron in the layer\n", " \n", " #returns output of each neuron as a list. If only one neuron, returns single output\n", " return outs[0] if len(outs) == 1 else outs \n", " \n", "x = [2.0, 3.0]\n", "n = Layer(2, 3) #-> two dimensional neurons, 3 of them\n", "n(x) #feed x to Layer(n)" ] }, { "cell_type": "markdown", "id": "fa3c8a49-e8be-41be-8c2b-8cd994b5eceb", "metadata": {}, "source": [ "### Now, let's define our MLP. 
"### Now, let's define our MLP. This will encapsulate all of the layers of neurons in our neural network.\n", "\n", "It will take a list of ```nouts``` (instead of a single ```nout```), which defines the sizes of all of the layers we want in our MLP.\n", "\n" ] },
{ "cell_type": "code", "execution_count": 120, "id": "213782e6-1496-4a73-9e6f-9e39962bd40a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Value(data=-0.6707963942601958)" ] }, "execution_count": 120, "metadata": {}, "output_type": "execute_result" } ], "source": [ "class Neuron:\n", "    def __init__(self, nin):\n", "        self.w = [Value(random.uniform(-1,1)) for _ in range(nin)]\n", "        self.b = Value(random.uniform(-1,1))\n", "\n", "    def __call__(self, x):\n", "        act = sum((wi*xi for wi, xi in zip(self.w, x)), self.b)\n", "        out = act.tanh()\n", "        return out\n", "\n", "class Layer:\n", "    def __init__(self, nin, nout):\n", "        #nin = num inputs to each neuron in the layer\n", "        #nout = number of neurons in the layer\n", "        self.neurons = [Neuron(nin) for _ in range(nout)]\n", "\n", "    def __call__(self, x):\n", "        outs = [n(x) for n in self.neurons]\n", "        return outs[0] if len(outs) == 1 else outs\n", "\n", "class MLP: #sample input: nin = 3; nouts = [4,4,1]; x = [2.0, 3.0, -1.0]\n", "\n", "    #nin = num inputs to the MLP\n", "    #nouts = list where each element is the number of neurons in each subsequent layer of the MLP\n", "    #sz combines nin and nouts into a single list, where:\n", "    #sz[0] is the input size, sz[1] is the num of neurons in the first layer,\n", "    #sz[2] is the num of neurons in the second layer, etc.\n", "\n", "    def __init__(self, nin, nouts):\n", "\n", "        sz = [nin] + nouts #sample input: sz = [3] + [4,4,1] = [3,4,4,1]\n", "\n", "        #iterate over consecutive pairs of these sizes and create a Layer object for each pair,\n", "        #going from the input size to the first layer's size, then from each layer's size to the next\n", "        self.layers = [Layer(sz[i], sz[i+1]) for i in range(len(nouts))]\n", "        #sample input: for each i in range(3), create a Layer object with sz[i] inputs and sz[i+1] neurons\n", "        #creates self.layers = [Layer(3,4), Layer(4,4), Layer(4,1)]\n", "        #ex: Layer(3,4) -> list of 4 Neuron objects with 3 inputs each (nin=3)\n", "\n", "    def __call__(self, x):\n", "        for layer in self.layers: #input x is passed through each layer in sequence\n", "            x = layer(x) #each layer processes its input and produces an output, which is the input for the next layer\n", "        return x #return the result of the last layer\n", "\n", "\n", "x = [2.0, 3.0, -1.0] #three dimensional input\n", "n = MLP(3, [4, 4, 1]) #three inputs into two layers of four and one output\n", "n(x)" ] },
{ "cell_type": "code", "execution_count": 123, "id": "d9c55ba1-8da7-456d-bf69-29a6ae702ec9", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[draw_dot output: Graphviz graph of the full MLP computation graph, with data/grad record nodes wired through *, +, and tanh op nodes; SVG omitted]" ] }, "execution_count": 123, "metadata": {}, "output_type": "execute_result" } ], "source": [ "draw_dot(n(x)) #draw the entire MLP; this will be huge :)" ] },
{ "cell_type": "markdown", "id": "69b68a79-e739-4c5d-ae50-aa88be7cc01f", "metadata": {}, "source": [ "## We just made a huge Multi-Layer Perceptron with our micrograd engine, mirroring PyTorch's API and using the same forward pass and backward pass principles from before. Now, we'll make a dataset and see how this would be run at a larger scale." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" } }, "nbformat": 4, "nbformat_minor": 5 }