{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Basics of deep learning and neural networks\n", "> In this chapter, you'll become familiar with the fundamental concepts and terminology used in deep learning, and understand why deep learning techniques are so powerful today. You'll build simple neural networks and generate predictions with them. This is the Summary of lecture \"Introduction to Deep Learning in Python\", via datacamp.\n", "\n", "- toc: true \n", "- badges: true\n", "- comments: true\n", "- author: Chanseok Kang\n", "- categories: [Python, Datacamp, Deep_Learning]\n", "- image: images/lnl.png" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Basics of deep learning and neural networks\n", "- Interactions\n", " - Neural Networks account for interactions really well\n", " - Deep Learning uses especially powerful neural networks\n", " - Text, Images, Videos, Audio, Source Code, etc.." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Forward propagation\n", "- Forward propagation\n", " - Multiply - add process\n", " - Dot product\n", " - Forward propagation for one data at a time\n", " - Output is the prediction for that data point" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Coding the forward propagation algorithm\n", "In this exercise, you'll write code to do forward propagation (prediction) for your first neural network:\n", "\n", "![fp](image/1_4.png)\n", "\n", "Each data point is a customer. The first input is how many accounts they have, and the second input is how many children they have. The model will predict how many transactions the user makes in the next year. \n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "input_data = np.array([3, 5])\n", "weights = {'node_0': np.array([2, 4]), \n", " 'node_1': np.array([ 4, -5]), \n", " 'output': np.array([2, 7])}" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-39\n" ] } ], "source": [ "# Calculate node 0 value: node_0_value\n", "node_0_value = (input_data * weights['node_0']).sum()\n", "\n", "# Calculate node 1 value: node_1_value\n", "node_1_value = (input_data * weights['node_1']).sum()\n", "\n", "# Put node values into array: hidden_layer_outputs\n", "hidden_layer_outputs = np.array([node_0_value, node_1_value])\n", "\n", "# Calculate output: output\n", "output = (hidden_layer_outputs * weights['output']).sum()\n", "\n", "# Print output\n", "print(output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Activation functions\n", "- Linear vs. Nonlinear Functions\n", "![lnl](image/lnl.png)\n", "- Activation function\n", " - Applied to node inputs to produce node output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The Rectified Linear Activation Function\n", "An **\"activation function\"** is a function applied at each node. It converts the node's input into some output.\n", "\n", "The rectified linear activation function (called ReLU) has been shown to lead to very high-performance networks. This function takes a single number as an input, returning 0 if the input is negative, and the input if the input is positive." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "52\n" ] } ], "source": [ "def relu(input):\n", " '''Define your relu activatino function here'''\n", " # Calculate the value for the output of the relu function: output\n", " output = max(0, input)\n", " \n", " # Return the value just calculate\n", " return output\n", "\n", "# Calculate node 0 value: node_0_output\n", "node_0_input = (input_data * weights['node_0']).sum()\n", "node_0_output = relu(node_0_input)\n", "\n", "# Calculate node 1 value: node_1_output\n", "node_1_input = (input_data * weights['node_1']).sum()\n", "node_1_output = relu(node_1_input)\n", "\n", "# Put node values into array: hidden_layer_outputs\n", "hidden_layer_outputs = np.array([node_0_output, node_1_output])\n", "\n", "# Calculate model output (do not apply relu)\n", "model_output = (hidden_layer_outputs * weights['output']).sum()\n", "\n", "# Print model output\n", "print(model_output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Applying the network to many observations/rows of data\n", "You'll now define a function called `predict_with_network()` which will generate predictions for multiple data observations, which are pre-loaded as input_data. As before, `weights` are also pre-loaded. In addition, the `relu()` function you defined in the previous exercise has been pre-loaded.\n", "\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "input_data = [np.array([3, 5]), np.array([ 1, -1]), \n", " np.array([0, 0]), np.array([8, 4])]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[52, 63, 0, 148]\n" ] } ], "source": [ "# Define predict_with_network()\n", "def predict_with_network(input_data_row, weights):\n", " # Calculate node 0 value\n", " node_0_input = (input_data_row * weights['node_0']).sum()\n", " node_0_output = relu(node_0_input)\n", " \n", " # Calculate node 1 value\n", " node_1_input = (input_data_row * weights['node_1']).sum()\n", " node_1_output = relu(node_1_input)\n", " \n", " # Put node values into array: hidden_layer_outputs\n", " hidden_layer_outputs = np.array([node_0_output, node_1_output])\n", " \n", " # Calculate model output\n", " input_to_final_layer = (hidden_layer_outputs * weights['output']).sum()\n", " model_output = relu(input_to_final_layer)\n", " \n", " # Return model output\n", " return(model_output)\n", "\n", "# Create empty list to store prediction results\n", "results = []\n", "for input_data_row in input_data:\n", " # Append prediction to results\n", " results.append(predict_with_network(input_data_row, weights))\n", " \n", "# Print results\n", "print(results)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deeper networks\n", "- Representation learning\n", " - Deep networks internally build representations of patterns in the data\n", " - Partially replace the need for feature engnerring\n", " - Subsequent layers build increasingly sophisticated representatios of raw data\n", "- Deep learning\n", " - Modeler doesn't need to specify the interactions\n", " - When you train the model, the neural network gets weights that find the relevant patterns to make better predictions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Multi-layer neural networks\n", "In this exercise, you'll write code to do forward propagation for a neural network with 2 hidden layers. 
Each hidden layer has two nodes. The input data has been preloaded as `input_data`. The nodes in the first hidden layer are called `node_0_0` and `node_0_1`. Their weights are pre-loaded as `weights['node_0_0']` and `weights['node_0_1']` respectively.\n", "\n", "The nodes in the second hidden layer are called `node_1_0` and `node_1_1`. Their weights are pre-loaded as `weights['node_1_0']` and `weights['node_1_1']` respectively.\n", "\n", "We then create a model output from the hidden nodes using weights pre-loaded as `weights['output']`.\n", "![mlnn](image/ch1ex10.png)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "input_data = np.array([3, 5])\n", "weights = {'node_0_0': np.array([2, 4]),\n", " 'node_0_1': np.array([ 4, -5]),\n", " 'node_1_0': np.array([-1, 2]),\n", " 'node_1_1': np.array([1, 2]),\n", " 'output': np.array([2, 7])}\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "182\n" ] } ], "source": [ "def predict_with_network(input_data):\n", " # Calculate node 0 in the first hidden layer\n", " node_0_0_input = (input_data * weights['node_0_0']).sum()\n", " node_0_0_output = relu(node_0_0_input)\n", " \n", " # Calculate node 1 in the first hidden layer\n", " node_0_1_input = (input_data * weights['node_0_1']).sum()\n", " node_0_1_output = relu(node_0_1_input)\n", " \n", " # Put node values into array: hidden_0_outputs\n", " hidden_0_outputs = np.array([node_0_0_output, node_0_1_output])\n", " \n", " # Calculate node 0 in the second hidden layer\n", " node_1_0_input = (hidden_0_outputs * weights['node_1_0']).sum()\n", " node_1_0_output = relu(node_1_0_input)\n", " \n", " # Calculate node 1 in the second hidden layer\n", " node_1_1_input = (hidden_0_outputs * weights['node_1_1']).sum()\n", " node_1_1_output = relu(node_1_1_input)\n", " \n", " # Put node values into array: hidden_1_outputs\n", " hidden_1_outputs = np.array([node_1_0_output, node_1_1_output])\n", " \n", " # Calculate model output: model_output\n", " model_output = (hidden_1_outputs * weights['output']).sum()\n", " \n", " # Return model_output\n", " return model_output\n", "\n", "output = predict_with_network(input_data)\n", "print(output)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }