{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 4.3 - Solution\n", "## Classification\n", "In the following tasks, we will repeatedly use some basic functions (e.g., the softmax function or the cross-entropy) of the [Keras](https://keras.io/) Library. To familiarize with them, we will implement the most important of them ourselves in this task.\n", "\n", "Suppose we want to classify some data (4 samples) into 3 distinct classes: 0, 1, and 2.\n", "We have set up a network with a pre-activation output `z` in the last layer.\n", "Applying softmax will give the final model output.\n", "\n", "input X ---> some network --> `z`
\n", "--> `y_model = softmax(z)`\n", "\n", "We quantify the agreement between truth (y) and model using categorical cross-entropy.\n", "\n", "$$J = - \\sum_i (y_i * \\log(y_\\mathrm{model}(x_i))$$\n", "\n", "In the following you are to implement softmax and categorical cross-entropy\n", "and evaluate them values given the values for `z`." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Data: 4 samples with the following class labels (input features X irrelevant here)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "y_cl = np.array([0, 0, 2, 1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### output of the last network layer before applying softmax" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "z = np.array([\n", " [4, 5, 1],\n", " [-1, -2, -3],\n", " [0.1, 0.2, 0.3],\n", " [-1, 17, 1]\n", " ]).astype(np.float32)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Task 1)\n", "Write a function that turns any class labels `y_cl` into one-hot encodings `y`.\n", "\n", "0 --> (1, 0, 0)
\n", "1 --> (0, 1, 0)
\n", "2 --> (0, 0, 1)
\n", "\n", "Make sure that `np.shape(y) = (4, 3)` for `np.shape(y_cl) = (4)`.\n", "\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "one-hot encoding of data labels\n", "[[1. 0. 0.]\n", " [1. 0. 0.]\n", " [0. 0. 1.]\n", " [0. 1. 0.]]\n" ] } ], "source": [ "def to_onehot(y_cl, num_classes):\n", " y = np.zeros((len(y_cl), num_classes))\n", " y[np.arange(4), y_cl] = 1\n", " return y\n", "\n", "y = to_onehot(y_cl, num_classes=3)\n", "print('one-hot encoding of data labels')\n", "print(y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Task 2)\n", "Write a function that returns the softmax of the input `z` along the last axis" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "softmax(z)\n", "[[2.6538792e-01 7.2139925e-01 1.3212887e-02]\n", " [6.6524100e-01 2.4472848e-01 9.0030573e-02]\n", " [3.0060962e-01 3.3222499e-01 3.6716542e-01]\n", " [1.5229979e-08 9.9999994e-01 1.1253517e-07]]\n" ] } ], "source": [ "def softmax(z):\n", " expz = np.exp(z).T\n", " return (expz / np.sum(expz, axis=0)).T\n", "\n", "y_model = softmax(z)\n", "print('softmax(z)')\n", "print(y_model)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Task 3)\n", "Compute the categorical cross-entropy between data and model" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "cross entropy = 0.684028\n" ] } ], "source": [ "crossentropy = -np.mean(np.sum(y * np.log(y_model), axis=1))\n", "crossentropy = -np.mean(np.log(y_model[np.arange(4), y_cl])) # alternative formulation\n", "print('cross entropy = %f' % crossentropy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Task 4)\n", "Determine which calsses are predicted by the model (maximum prediction)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "true class labels = [0 0 2 1]\n", "predicted class labels = [1 0 2 1]\n" ] } ], "source": [ "y_model_cl = np.argmax(y_model, axis=1)\n", "print('\\ntrue class labels = ', y_cl)\n", "print('predicted class labels =', y_model_cl)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Task 5)\n", "Estimate how many samples are classified correctly (accuracy)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "accuracy = 0.75\n" ] } ], "source": [ "accuracy = np.mean(y_model_cl == y_cl)\n", "print('accuracy = %.2f' % accuracy)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" } }, "nbformat": 4, "nbformat_minor": 4 }