{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Activity 5 - Neural Networks\n",
    "\n",
    "In these examples, we show how you can create a small neural network. We have our input X, and our outputs Y, and we want to train a small network to map input to output (in this example, if the first and third column are 1 then the output is 1 - that is what the network will effectively learn).\n",
    "\n",
    "More details on these examples at iamtrask blog: https://iamtrask.github.io/2015/07/12/basic-python-network/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# input dataset\n",
    "X = np.array([  [0,0,1],\n",
    "                [0,1,1],\n",
    "                [1,0,1],\n",
    "                [1,1,1] ])\n",
    "    \n",
    "# output dataset            \n",
    "y = np.array([[0,0,1,1]]).T\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Output After Training:\n",
      "[[0.00966449]\n",
      " [0.00786506]\n",
      " [0.99358898]\n",
      " [0.99211957]]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "# sigmoid function\n",
    "def nonlin(x,deriv=False):\n",
    "    if(deriv==True):\n",
    "        return x*(1-x)\n",
    "    return 1/(1+np.exp(-x))\n",
    "    \n",
    "# seed random numbers to make calculation\n",
    "# deterministic (just a good practice)\n",
    "np.random.seed(1)\n",
    "\n",
    "# initialize weights randomly with mean 0\n",
    "syn0 = 2*np.random.random((3,1)) - 1\n",
    "\n",
    "for iter in range(10000):\n",
    "\n",
    "    # forward propagation\n",
    "    l0 = X\n",
    "    l1 = nonlin(np.dot(l0,syn0))\n",
    "\n",
    "    # how much did we miss?\n",
    "    l1_error = y - l1\n",
    "\n",
    "    # multiply how much we missed by the \n",
    "    # slope of the sigmoid at the values in l1\n",
    "    l1_delta = l1_error * nonlin(l1,True)\n",
    "\n",
    "    # update weights\n",
    "    syn0 += np.dot(l0.T,l1_delta)\n",
    "\n",
    "print (\"Output After Training:\")\n",
    "print (l1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We want to learn the set of weights that syn0 should hold, such that the error between l1 (the thing being predicted), and Y (the values that are known) is minimised. The output matrix after training shows that we achieve very good results (the values 0.99 are effectively 1, and the others are effectively 0, which matches with our original output labels)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Error:0.4685343254580603\n",
      "Error:0.005002426725395313\n",
      "Error:0.00345440546153305\n",
      "Error:0.002786557019672355\n",
      "Error:0.0023941155055209216\n",
      "Error:0.0021288852682254146\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "def nonlin(x,deriv=False):\n",
    "    if(deriv==True):\n",
    "        return x*(1-x)\n",
    "    return 1/(1+np.exp(-x))\n",
    "    \n",
    "np.random.seed(1)\n",
    "\n",
    "# randomly initialize our weights with mean 0\n",
    "syn0 = 2*np.random.random((3,4)) - 1\n",
    "syn1 = 2*np.random.random((4,1)) - 1\n",
    "\n",
    "for j in range(60000):\n",
    "    # Feed forward through layers 0, 1, and 2\n",
    "    l0 = X\n",
    "    l1 = nonlin(np.dot(l0,syn0))\n",
    "    l2 = nonlin(np.dot(l1,syn1))\n",
    "\n",
    "    # how much did we miss the target value?\n",
    "    l2_error = y - l2\n",
    "    \n",
    "    if (j% 10000) == 0:\n",
    "        print (\"Error:\" + str(np.mean(np.abs(l2_error))))\n",
    "        \n",
    "    # in what direction is the target value?\n",
    "    # were we really sure? if so, don't change too much.\n",
    "    l2_delta = l2_error*nonlin(l2,deriv=True)\n",
    "\n",
    "    # how much did each l1 value contribute to the l2 error (according to the weights)?\n",
    "    l1_error = l2_delta.dot(syn1.T)\n",
    "    \n",
    "    # in what direction is the target l1?\n",
    "    # were we really sure? if so, don't change too much.\n",
    "    l1_delta = l1_error * nonlin(l1,deriv=True)\n",
    "\n",
    "    syn1 += l1.T.dot(l2_delta)\n",
    "    syn0 += l0.T.dot(l1_delta)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this next example, we have now extended to two layers within our network, so we have syn0 and syn1 as random weight matrices. Just as before, we update our weights based on the error at the output, but we then have to step backwards and update the first weight matrix based on the second matrix (so we work backwards with the error - known as backpropagation)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}