{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Feedforward neural network with regularization\n", "\n", "Neural networks are a popular and powerful family of machine learning models, loosely inspired by the way the brain works. There are many types of neural networks; in this notebook I'll implement a basic feedforward network as described in Andrew Ng's course on Coursera." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![GitHub Logo](http://dl2.joxi.net/drive/2016/11/18/0012/1517/833005/05/922c2d7881.png)\n", "\n", "The structure of this neural network is simple: an input layer, one hidden layer and an output layer. L2 regularization is added to reduce overfitting." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "from scipy.special import expit\n", "from scipy.optimize import minimize" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "train = pd.read_csv('../input/train.csv')\n", "test = pd.read_csv('../input/test.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I add the same new features as in the other [notebook](http://nbviewer.jupyter.org/github/Erlemar/Erlemar.github.io/blob/master/Notebooks/GGG.ipynb)."
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "train['hair_soul'] = train['hair_length'] * train['has_soul']\n", "train['hair_bone'] = train['hair_length'] * train['bone_length']\n", "test['hair_soul'] = test['hair_length'] * test['has_soul']\n", "test['hair_bone'] = test['hair_length'] * test['bone_length']\n", "train['hair_soul_bone'] = train['hair_length'] * train['has_soul'] * train['bone_length']\n", "test['hair_soul_bone'] = test['hair_length'] * test['has_soul'] * test['bone_length']" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "X = np.array(train.drop(['id', 'color', 'type'], axis=1))\n", "X = np.insert(X,0,1,axis=1)\n", "X_test = np.array(test.drop(['id', 'color'], axis=1))\n", "X_test = np.insert(X_test,0,1,axis=1)\n", "Y_train = np.array((pd.get_dummies(train['type'], drop_first=False)).astype(float))\n", "# I'll need this for predictions.\n", "monsters = (pd.get_dummies(train['type'], drop_first=False)).columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These are the parameters of the neural network. I added a column of ones to the features as a bias term, so the input size is 8. The number of nodes in the hidden layer is arbitrary; I chose 12 after some testing. `params` holds random initial weights in [-0.5, 0.5), flattened into a single vector with one entry per weight in the network. Note that `learning_rate` is used as the regularization strength in the cost and gradient below, not as an optimizer step size." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "hidden_size = 12\n", "learning_rate = 1\n", "params = (np.random.random(size=hidden_size * (X.shape[1]) + Y_train.shape[1] * (hidden_size + 1)) - 0.5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Forward propagation. The input is multiplied by the first weight matrix and passed through a sigmoid to give the hidden layer. A bias unit is then added, and the result is multiplied by the second weight matrix and passed through a sigmoid again to give the output."
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def forward_propagate(X, theta1, theta2):\n", "    z2 = X * theta1.T\n", "    a2 = np.insert(expit(z2), 0, 1, axis=1)\n", "    a3 = expit(a2 * theta2.T)\n", "    return z2, a2, a3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Backpropagation. The output error is propagated back through the network to compute the gradient of the cost with respect to each weight matrix. The regularization term is added to the gradient at the end, skipping the bias column." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def back_propagate(X, y, theta1, theta2, z2, a2, a3):\n", "    D1 = np.zeros(theta1.shape)\n", "    D2 = np.zeros(theta2.shape)\n", "\n", "    for t in range(len(X)):\n", "        z2t = z2[t,:]\n", "\n", "        d3t = a3[t,:] - y[t,:]\n", "        z2t = np.insert(z2t, 0, values=1)\n", "        d2t = np.multiply((theta2.T * d3t.T).T, np.multiply(expit(z2t), (1 - expit(z2t))))\n", "\n", "        D1 += (d2t[:,1:]).T * X[t,:]\n", "        D2 += d3t.T * a2[t,:]\n", "\n", "    D1 = D1 / len(X)\n", "    D2 = D2 / len(X)\n", "\n", "    D1[:,1:] += (theta1[:,1:] * learning_rate) / len(X)\n", "    D2[:,1:] += (theta2[:,1:] * learning_rate) / len(X)\n", "    return D1, D2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cost function. Convert the input and output into matrices and split params into the two theta matrices. Then forward propagate and compute the cross-entropy loss with regularization. After that, backpropagate to get the gradient that the optimizer uses to minimize the cost."
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2.0055905698764396, (135,))" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def cost(params, X, y, learningRate):\n", "    X = np.matrix(X)\n", "    y = np.matrix(y)\n", "    theta1 = np.matrix(np.reshape(params[:hidden_size * (X.shape[1])], (hidden_size, (X.shape[1]))))\n", "    theta2 = np.matrix(np.reshape(params[hidden_size * (X.shape[1]):], (Y_train.shape[1], (hidden_size + 1))))\n", "\n", "    z2, a2, a3 = forward_propagate(X, theta1, theta2)\n", "    J = 0\n", "    for i in range(len(X)):\n", "        first_term = np.multiply(-y[i,:], np.log(a3[i,:]))\n", "        second_term = np.multiply((1 - y[i,:]), np.log(1 - a3[i,:]))\n", "        J += np.sum(first_term - second_term)\n", "\n", "    J = (J + (float(learningRate) / 2) * (np.sum(np.power(theta1[:,1:], 2)) + np.sum(np.power(theta2[:,1:], 2)))) / len(X)\n", "\n", "    # Backpropagation\n", "    D1, D2 = back_propagate(X, y, theta1, theta2, z2, a2, a3)\n", "\n", "    # Unravel the gradient into a single array.\n", "    grad = np.concatenate((np.ravel(D1), np.ravel(D2)))\n", "    return J, grad\n", "# Simply to see that this works.\n", "J, grad = cost(params, X, Y_train, 1)\n", "J, grad.shape" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Minimizing function.\n", "fmin = minimize(cost, x0=params, args=(X, Y_train, learning_rate), method='TNC', jac=True, options={'maxiter': 600})" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [
"# Get the optimized weights and use them to get output.\n", "theta1 = np.matrix(np.reshape(fmin.x[:hidden_size * (X.shape[1])], (hidden_size, (X.shape[1]))))\n", "theta2 = np.matrix(np.reshape(fmin.x[hidden_size * (X.shape[1]):], (Y_train.shape[1], (hidden_size + 1))))\n", "z2, a2, a3 = forward_propagate(X, theta1, theta2)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "# Prediction is in the form of probabilities for each class. Get the class with the highest probability.\n", "def pred(a):\n", "    for i in range(len(a)):\n", "        yield monsters[np.argmax(a[i])]\n", "prediction = list(pred(a3))" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "accuracy = 76.01078167115904%\n" ] } ], "source": [ "# Accuracy on training dataset.\n", "accuracy = sum(prediction == train['type']) / len(train['type'])\n", "print('accuracy = {0}%'.format(accuracy * 100))" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "# Predict on test set.\n", "z2, a2, a3_test = forward_propagate(X_test, theta1, theta2)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "prediction_test = list(pred(a3_test))" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "submission = pd.DataFrame({'id':test['id'], 'type':prediction_test})\n", "submission.to_csv('GGG_submission.csv', index=False)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "I got an accuracy of ~0.741 with this neural network. A good result, considering that my ensemble got ~0.748."
] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [Root]", "language": "python", "name": "Python [Root]" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 1 }