{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Feedforward neural network with regularization\n",
    "\n",
    "Neural networks is a popular and powerful type of machine learning. This algorithm tries to model the way the brain thinks. There are many types of neural networks, in this notebook I'll implement a basic feedforward network as it is described in Andrew Ng's course on Coursera."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![GitHub Logo](http://dl2.joxi.net/drive/2016/11/18/0012/1517/833005/05/922c2d7881.png)\n",
    "\n",
    "The structure of this neural network is simple: input layer, one hidden layer and output layer. Regularization is added to prevent overfitting."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import random\n",
    "from scipy.special import expit\n",
    "import scipy.optimize\n",
    "from scipy.optimize import minimize"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "train = pd.read_csv('../input/train.csv')\n",
    "test = pd.read_csv('../input/test.csv')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "I add the same new features as in the other [notebook](http://nbviewer.jupyter.org/github/Erlemar/Erlemar.github.io/blob/master/Notebooks/GGG.ipynb)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "train['hair_soul'] = train['hair_length'] * train['has_soul']\n",
    "train['hair_bone'] = train['hair_length'] * train['bone_length']\n",
    "test['hair_soul'] = test['hair_length'] * test['has_soul']\n",
    "test['hair_bone'] = test['hair_length'] * test['bone_length']\n",
    "train['hair_soul_bone'] = train['hair_length'] * train['has_soul'] * train['bone_length']\n",
    "test['hair_soul_bone'] = test['hair_length'] * test['has_soul'] * test['bone_length']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = np.array(train.drop(['id', 'color', 'type'], axis=1))\n",
    "X = np.insert(X,0,1,axis=1)\n",
    "X_test = np.array(test.drop(['id', 'color'], axis=1))\n",
    "X_test = np.insert(X_test,0,1,axis=1)\n",
    "Y_train = np.array((pd.get_dummies(train['type'], drop_first=False)).astype(float))\n",
    "#I'll need this for predictions.\n",
    "monsters = (pd.get_dummies(train['type'], drop_first=False)).columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These are the parameters of neural network. I added additional column to variables as bias, so the input size is 8. Number of nodes in hidden layer is arbitraty, I chose 12 after some test. Params - random initial weights with the same size as the network."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "hidden_size = 12\n",
    "learning_rate = 1\n",
    "params = (np.random.random(size=hidden_size * (X.shape[1]) + Y_train.shape[1] * (hidden_size + 1)) - 0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Forwardpropagation. Input is multiplied by weights, after that goes hidden layer with sigmoid function and output with sigmoid function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def forward_propagate(X, theta1, theta2):\n",
    "    z2 = X * theta1.T\n",
    "    a2 = np.insert(expit(z2), 0, 1, axis=1) \n",
    "    a3 = expit(a2 * theta2.T)\n",
    "    return z2, a2, a3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Backpropagation. \"Going back\" to minimize the error. And adding regularization here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def back_propagate(X, y, theta1, theta2, z2, a2, a3):\n",
    "    D1 = np.zeros(theta1.shape)\n",
    "    D2 = np.zeros(theta2.shape)\n",
    "    \n",
    "    for t in range(len(X)):\n",
    "        z2t = z2[t,:]\n",
    "        \n",
    "        d3t = a3[t,:] - y[t,:]\n",
    "        z2t = np.insert(z2t, 0, values=1)\n",
    "        d2t = np.multiply((theta2.T * d3t.T).T, np.multiply(expit(z2t), (1 - expit(z2t))))\n",
    "        \n",
    "        D1 += (d2t[:,1:]).T * X[t,:]\n",
    "        D2 += d3t.T * a2[t,:]\n",
    "        \n",
    "    D1 = D1 / len(X)\n",
    "    D2 = D2 / len(X)\n",
    "    \n",
    "    D1[:,1:] += (theta1[:,1:] * learning_rate) / len(X)\n",
    "    D2[:,1:] += (theta2[:,1:] * learning_rate) / len(X)\n",
    "    return D1, D2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Cost function. Convert input and output into matrixes. Divide params into thetas. Then forwardpropagate and calculate loss with regularization. After that backpropagate to minimize cost."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2.0055905698764396, (135,))"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def cost(params, X, y, learningRate):  \n",
    "    X = np.matrix(X)\n",
    "    y = np.matrix(y)\n",
    "    theta1 = np.matrix(np.reshape(params[:hidden_size * (X.shape[1])], (hidden_size, (X.shape[1]))))\n",
    "    theta2 = np.matrix(np.reshape(params[hidden_size * (X.shape[1]):], (Y_train.shape[1], (hidden_size + 1))))\n",
    "\n",
    "    z2, a2, a3 = forward_propagate(X, theta1, theta2)\n",
    "    J = 0\n",
    "    for i in range(len(X)):\n",
    "        first_term = np.multiply(-y[i,:], np.log(a3[i,:]))\n",
    "        second_term = np.multiply((1 - y[i,:]), np.log(1 - a3[i,:]))\n",
    "        J += np.sum(first_term - second_term)\n",
    "    \n",
    "    J = (J + (float(learningRate) / 2) * (np.sum(np.power(theta1[:,1:], 2)) + np.sum(np.power(theta2[:,1:], 2)))) / len(X)\n",
    "    \n",
    "    #Backpropagation\n",
    "    D1, D2 = back_propagate(X, y, theta1, theta2, z2, a2, a3)\n",
    "    \n",
    "    #Unravel the gradient into a single array.\n",
    "    grad = np.concatenate((np.ravel(D1), np.ravel(D2)))\n",
    "    return J, grad\n",
    "#Simply to see that this works.\n",
    "J, grad = cost(params, X, Y_train, 1)\n",
    "J, grad.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "#Minimizing function.\n",
    "fmin = minimize(cost, x0=params, args=(X, Y_train, learning_rate), method='TNC', jac=True, options={'maxiter': 600})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Get the optimized weights and use them to get output. \n",
    "theta1 = np.matrix(np.reshape(fmin.x[:hidden_size * (X.shape[1])], (hidden_size, (X.shape[1]))))\n",
    "theta2 = np.matrix(np.reshape(fmin.x[hidden_size * (X.shape[1]):], (Y_train.shape[1], (hidden_size + 1))))\n",
    "z2, a2, a3 = forward_propagate(X, theta1, theta2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Prediction is in form of probabilities for each class. Get the class with highest probability.\n",
    "def pred(a):\n",
    "    for i in range(len(a)):\n",
    "        yield monsters[np.argmax(a[i])]\n",
    "prediction = list(pred(a3))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "accuracy = 76.01078167115904%\n"
     ]
    }
   ],
   "source": [
    "#Accuracy on training dataset.\n",
    "accuracy = sum(prediction == train['type']) / len (train['type'])\n",
    "print('accuracy = {0}%'.format(accuracy * 100))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Predict on test set.\n",
    "z2, a2, a3_test = forward_propagate(X_test, theta1, theta2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "prediction_test = list(pred(a3_test))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "submission = pd.DataFrame({'id':test['id'], 'type':prediction_test})\n",
    "submission.to_csv('GGG_submission.csv', index=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "I got an accuracy of ~0.741 with this neural network. A good result, considering that my ensemble got ~0.748."
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [Root]",
   "language": "python",
   "name": "Python [Root]"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}