{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Fully-Connected Layers Tutorial on Fashion MNIST Data Set " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a simple tutorial on how to use `FCLayer` (fully-connected layer) to train on and predict the Fashion MNIST data set." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### First, Let's Load the Packages" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Plots.GRBackend()" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using MLDatasets\n", "using NumNN\n", "using Plots\n", "gr()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Temporary Setup for the ProgressMeter.jl Package" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "### uncomment this line the first time you run this code\n", "# ] add https://github.com/timholy/ProgressMeter.jl.git ;" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "using ProgressMeter\n", "ProgressMeter.ijulia_behavior(:clear);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load the Train/Test Data/Labels" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "X_train, Y_train = FashionMNIST.traindata(Float64);\n", "X_test, Y_test = FashionMNIST.testdata(Float64);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's Prepare the Data/Labels" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "X_train ./= 255\n", "X_test ./= 255\n", "Y_train = oneHot(Y_train)\n", "Y_test = oneHot(Y_test);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### It's Time for the Layers" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "X_Input = Input(X_train) # or Input(size(X_train))\n", "X = 
Flatten()(X_Input)\n", "X = FCLayer(120, :relu)(X)\n", "X_Output = FCLayer(10, :softmax)(X);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When there are no side branches, another way is to use the `chain` function as follows:\n", "\n", "```julia\n", "X_Input, X_Output = chain(X_train, [Flatten(), FCLayer(120, :relu), FCLayer(10, :softmax)]);\n", "```\n", "\n", "`chain` returns a `Tuple` of two pointers: one to the input `Layer` and one to the output `Layer`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define the Model " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Defining the model also initializes the `Layer`s' parameters." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "model = Model(X_train,Y_train,X_Input,X_Output, 0.005; optimizer=:adam);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Let's use `predict` to see the current Accuracy" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\u001b[32mProgress: 100%|█████████████████████████████████████████| Time: 0:00:05\u001b[39m\n", "\u001b[34m Instances 10000: 10000\u001b[39m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "The accuracy of Test Data before the training process 0.0634\n", "The cost of Test Data before the training process 2.3028\n" ] } ], "source": [ "TestP = predict(model, X_test, Y_test);\n", "\n", "println()\n", "println(\"The accuracy of Test Data before the training process $(round(TestP[:accuracy], digits=4))\")\n", "println(\"The cost of Test Data before the training process $(round(TestP[:cost], digits=4))\")" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\u001b[32mProgress: 100%|█████████████████████████████████████████| Time: 0:00:06\u001b[39m\n", "\u001b[34m Instances 60000: 60000\u001b[39m\n" ] }, { "name": 
"stdout", "output_type": "stream", "text": [ "\n", "The accuracy of Train Data before the training process 0.0612\n", "The cost of Train Data before the training process 2.3028\n" ] } ], "source": [ "TrainP = predict(model, X_train, Y_train);\n", "\n", "println()\n", "println(\"The accuracy of Train Data before the training process $(round(TrainP[:accuracy], digits=4))\")\n", "println(\"The cost of Train Data before the training process $(round(TrainP[:cost], digits=4))\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Train the model" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\u001b[32mProgress: 100%|█████████████████████████████████████████| Time: 0:00:56\u001b[39m\n", "\u001b[34m Epoch 10: 10\u001b[39m\n", "\u001b[34m Instances 60000: 60000\u001b[39m\n", "\u001b[34m Train Cost: 0.3447\u001b[39m\n", "\u001b[34m Train Accuracy: 0.8757\u001b[39m\n" ] } ], "source": [ "TrainD = train(X_train, Y_train, model, 10); # testData = X_test, testLabels = Y_test);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `train` function accepts extra keyword arguments (`testData` and `testLabels`) that supply test data and labels, so the test costs and accuracies are computed during each training epoch. 
\n", "\n", "**Note:** This makes training take longer.\n", "\n", "Without the test data, `train` can be called as follows:\n", "\n", "```julia\n", "TrainD = train(X_train, Y_train, model, 10)\n", "```" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "plot(1:10, TrainD[:trainAccuracies], label=\"Training Accuracies\")\n", "plot!(1:10, TrainD[:trainCosts], label=\"Training Costs\")\n", "# plot!(1:10, TrainD[:testAccuracies], label=\"Test Accuracies\")\n", "# plot!(1:10, TrainD[:testCosts], label=\"Test Costs\")\n", "xlabel!(\"Epochs\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Predict After Training" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\u001b[32mProgress: 100%|█████████████████████████████████████████| Time: 0:00:06\u001b[39m\n", "\u001b[34m Instances 60000: 60000\u001b[39m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "The accuracy of Train Data after the training process 0.8812\n", "The cost of Train Data after the training process 0.3314\n" ] } ], "source": [ "TrainP = predict(model, X_train, Y_train);\n", "\n", "println()\n", "println(\"The accuracy of Train Data after the training process $(round(TrainP[:accuracy], digits=4))\")\n", "println(\"The cost of Train Data after the training process $(round(TrainP[:cost], digits=4))\")" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ 
"\u001b[32mProgress: 100%|█████████████████████████████████████████| Time: 0:00:01\u001b[39m\n", "\u001b[34m Instances 10000: 10000\u001b[39m\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "The accuracy of Test Data after the training process 0.8629\n", "The cost of Test Data after the training process 0.3845\n" ] } ], "source": [ "TestP = predict(model, X_test, Y_test);\n", "\n", "println()\n", "println(\"The accuracy of Test Data after the training process $(round(TestP[:accuracy], digits=4))\")\n", "println(\"The cost of Test Data after the training process $(round(TestP[:cost], digits=4))\")" ] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.4.1", "language": "julia", "name": "julia-1.4" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.4.1" } }, "nbformat": 4, "nbformat_minor": 4 }