{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "DGPlYumZnO1t" }, "source": [ "# Manual Hyperparameter Tuning a Neural Network\n", "\n", "## Introduction\n", "In Challenge 4, we built a simple neural network using Keras. In this challenge, we will manually experiment with hyperarameters.\n", "\n", "This notebook contains the same workflow. Step 2 is the section that shows the simple 3-layer neural network from chapter 4. Section three is for you to change the same neural network by adding additional layers, neurons, and changing the number of epochs - then running the plot graph cell to see the results.\n", "\n", "NOTE: Be sure to re-run the begining cells before working on Section 3.\n", "\n", "## Dataset\n", "The advertising dataset captures the sales revenue generated with respect to advertisement costs across numerous platforms like radio, TV, and newspapers.\n", "\n", "### Features:\n", "\n", "#### Digital: advertising dollars spent on Internet.\n", "#### TV: advertising dollars spent on TV.\n", "#### Radio: advertising dollars spent on Radio.\n", "#### Newspaper: advertising dollars spent on Newspaper.\n", "\n", "### Target (Label):\n", "#### Sales budget" ] }, { "cell_type": "markdown", "metadata": { "id": "YBXAUbijjNBF" }, "source": [ "# Step 1: Data Preparation" ] }, { "cell_type": "markdown", "metadata": { "id": "AsHg6SD2nO1v" }, "source": [ "### Import Libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "gEXV-RxPnO1w" }, "outputs": [], "source": [ "# Import the necessary libraries\n", "\n", "# For Data loading, Exploraotry Data Analysis, Graphing\n", "import pandas as pd # Pandas for data processing libraries\n", "import numpy as np # Numpy for mathematical functions\n", "\n", "import matplotlib.pyplot as plt # Matplotlib for visualization tasks\n", "import seaborn as sns # Seaborn for data visualization library based on matplotlib.\n", "%matplotlib inline\n", "\n", "import sklearn # ML tasks\n", "from sklearn.model_selection import train_test_split # Split the dataset\n", "from sklearn.metrics import mean_squared_error # Calculate Mean Squared Error\n", "\n", "# Build the Network\n", "from tensorflow import keras\n", "from keras.models import Sequential\n", "#from tensorflow.keras.models import Sequential\n", "from keras.layers import Dense\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "wSDtqpd-Dk9H" }, "outputs": [], "source": [ "# Next, you read the dataset into a Pandas dataframe.\n", "\n", "url = 'https://github.com/LinkedInLearning/artificial-intelligence-foundations-neural-networks-4381282/blob/main/Advertising_2023.csv?raw=true'\n", "advertising_df= pd.read_csv(url,index_col=0)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "coiKjFmcnaHU" }, "outputs": [], "source": [ "# Pandas info() function is used to get a concise summary of the dataframe.\n", "advertising_df.info()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "BTNIhkVYv9HZ" }, "outputs": [], "source": [ "### Get summary of statistics of the data\n", "advertising_df.describe()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "IzyfOhGaEzlL" }, "outputs": [], "source": [ "#shape of dataframe - 1199 rows, five columns\n", "advertising_df.shape" ] }, { "cell_type": "markdown", "metadata": { "id": "6nNrSEBnBk71" }, "source": [ "Let's check for any null values." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "P7gHdfMdBk71" }, "outputs": [], "source": [ "# The isnull() method is used to check and manage NULL values in a data frame.\n", "advertising_df.isnull().sum()" ] }, { "cell_type": "markdown", "metadata": { "id": "QWVdsrmgnO1_" }, "source": [ "## Exploratory Data Analysis (EDA)\n", "\n", "Let's create some simple plots to check out the data! " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "i8bsyyVPm13-" }, "outputs": [], "source": [ "## Plot the heatmap so that the values are shown.\n", "\n", "plt.figure(figsize=(10,5))\n", "sns.heatmap(advertising_df.corr(),annot=True,vmin=0,vmax=1,cmap='ocean')\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "8UdKVi-2tNKI" }, "outputs": [], "source": [ "#create a correlation matrix\n", "corr = advertising_df.corr()\n", "plt.figure(figsize=(10, 5))\n", "sns.heatmap(corr[(corr >= 0.5) | (corr <= -0.7)],\n", " cmap='viridis', vmax=1.0, vmin=-1.0, linewidths=0.1,\n", " annot=True, annot_kws={\"size\": 8}, square=True)\n", "plt.tight_layout()\n", "display(plt.show())" ] }, { "cell_type": "code", "source": [ "advertising_df.corr()" ], "metadata": { "id": "X1tn4RQYA7o-" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "WKFX9SCauZp4" }, "outputs": [], "source": [ "### Visualize Correlation\n", "\n", "# Generate a mask for the upper triangle\n", "mask = np.zeros_like(advertising_df.corr(), dtype=np.bool)\n", "mask[np.triu_indices_from(mask)] = True\n", "\n", "# Set up the matplotlib figure\n", "f, ax = plt.subplots(figsize=(11, 9))\n", "\n", "# Generate a custom diverging colormap\n", "cmap = sns.diverging_palette(220, 10, as_cmap=True)\n", "\n", "# Draw the heatmap with the mask and correct aspect ratio\n", "sns.heatmap(advertising_df.corr(), mask=mask, cmap=cmap, vmax=.9, square=True, linewidths=.5, ax=ax)" ] }, { "cell_type": "markdown", "metadata": { "id": "2eDQ9CNw3uej" }, "source": [ "Since Sales is our target variable, we should identify which variable correlates the most with Sales.\n", "\n", "As we can see, TV has the highest correlation with Sales.\n", "Let's visualize the relationship of variables using scatterplots." ] }, { "cell_type": "markdown", "metadata": { "id": "3f7bN--N5TYF" }, "source": [ "Rather than plot them separately, an efficient way to view the linear relationsips between variables is to use a \"for loop\" that plots all of the features at once.\n", "\n", "It seems there's no clear linear relationships between the predictors.\n", "\n", "At this point, we know that the variable TV will more likely give better prediction of Sales because of the high correlation and linearity of the two." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "otYqhhMapHhY" }, "outputs": [], "source": [ "'''=== Show the linear relationship between features and sales Thus, it provides that how the scattered\n", " they are and which features has more impact in prediction of house price. ==='''\n", "\n", "# visiualize all variables with sales\n", "from scipy import stats\n", "#creates figure\n", "plt.figure(figsize=(18, 18))\n", "\n", "for i, col in enumerate(advertising_df.columns[0:13]): #iterates over all columns except for price column (last one)\n", " plt.subplot(5, 3, i+1) # each row three figure\n", " x = advertising_df[col] #x-axis\n", " y = advertising_df['sales'] #y-axis\n", " plt.plot(x, y, 'o')\n", "\n", " # Create regression line\n", " plt.plot(np.unique(x), np.poly1d(np.polyfit(x, y, 1)) (np.unique(x)), color='red')\n", " plt.xlabel(col) # x-label\n", " plt.ylabel('sales') # y-label\n" ] }, { "cell_type": "markdown", "metadata": { "id": "ThNRxdAmA5fQ" }, "source": [ "Concluding results after observing the Graph\n", "The relation bw TV and Sales is stong and increases in linear fashion\n", "The relation bw Radio and Sales is less stong\n", "The relation bw TV and Sales is weak" ] }, { "cell_type": "markdown", "metadata": { "id": "OIPKB4hanO2F" }, "source": [ "## Training a Linear Regression Model\n", "\n", "Regression is a supervised machine learning process. It is similar to classification, but rather than predicting a label, you try to predict a continuous value. Linear regression defines the relationship between a target variable (y) and a set of predictive features (x). Simply stated, If you need to predict a number, then use regression.\n", "\n", "Let's now begin to train your regression model! You will need to first split up your data into an X array that contains the features to train on, and a y array with the target variable, in this case the Price column. You will toss out the Address column because it only has text info that the linear regression model can't use." ] }, { "cell_type": "markdown", "metadata": { "id": "KFDhhs1KA22t" }, "source": [ "#### Data Preprocessing" ] }, { "cell_type": "markdown", "metadata": { "id": "l0NjRPeFBk74" }, "source": [ "##### Split: X (features) and y (target)\n", "Next, let's define the features and label. Briefly, feature is input; label is output. This applies to both classification and regression problems." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "id": "ZEEGuBAnnO2F" }, "outputs": [], "source": [ "X = advertising_df[['digital', 'TV', 'radio', 'newspaper']]\n", "y = advertising_df['sales']\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "lXgBbCFcBCfG" }, "source": [ "##### Scaling (Normalization)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "id": "hVOPQyd2HtxP" }, "outputs": [], "source": [ "'''=== Noramlization the features. Since it is seen that features have different ranges, it is best practice to\n", "normalize/standarize the feature before using them in the model ==='''\n", "\n", "#feature normalization\n", "normalized_feature = keras.utils.normalize(X.values)\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "X97FWdDOnO2H" }, "source": [ "##### Train - Test - Split\n", "\n", "Now let's split the data into a training and test set. Note: Best pracices is to split into three - training, validation, and test set.\n", "\n", "By default - It splits the given data into 75-25 ratio\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "id": "fS99Llq8nO2J" }, "outputs": [], "source": [ "# Import train_test_split function from sklearn.model_selection\n", "from sklearn.model_selection import train_test_split\n", "\n", "# Split up the data into a training set\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=101)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "id": "K4p_3UGEcqPb", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "3d6bd4ed-0ab6-4c75-8315-e56f0b50e077" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "(719, 4) (480, 4) (719,) (480,)\n" ] } ], "source": [ "print(X_train.shape,X_test.shape, y_train.shape, y_test.shape )" ] }, { "cell_type": "markdown", "metadata": { "id": "mdc8pL4TIb9k" }, "source": [ "# Step 2: Build Network\n" ] }, { "cell_type": "markdown", "metadata": { "id": "J10hkM7BAhbh" }, "source": [ "#### Build and Train the Network" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "K-_Vage0th27" }, "outputs": [], "source": [ "## Build Model (Building a three layer network - with one hidden layer)\n", "model = Sequential()\n", "model.add(Dense(4,input_dim=4, activation='relu')) # You don't have to specify input size.Just define the hidden layers\n", "model.add(Dense(3,activation='relu'))\n", "model.add(Dense(1))\n", "\n", "# Compile Model\n", "model.compile(optimizer='adam', loss='mse',metrics=['mse'])\n", "\n", "# Fit the Model\n", "history = model.fit(X_train, y_train, validation_data = (X_test, y_test),\n", " epochs = 32)\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "RHBzoonKnFfj" }, "source": [ "### Visualization" ] }, { "cell_type": "markdown", "metadata": { "id": "ktMqgjoDGAe7" }, "source": [ "You can add more 'flavor' to the graph by making it bigger and adding labels and names, as shown below." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "jB1VUt-bmTPH" }, "outputs": [], "source": [ "## Plot a graph of model loss # show the graph of model loss in trainig and validation\n", "\n", "plt.figure(figsize=(15,8))\n", "plt.plot(history.history['loss'])\n", "plt.plot(history.history['val_loss'])\n", "plt.title('Model Loss (MSE) on Training and Validation Data')\n", "plt.ylabel('Loss-Mean Squred Error')\n", "plt.xlabel('Epoch')\n", "plt.legend(['Val Loss', 'Train Loss'], loc='upper right')\n", "plt.show()\n", "\n" ] }, { "cell_type": "markdown", "source": [ "# Step 3: Tune the Neural Network Hyperparameters" ], "metadata": { "id": "qja3d7BkyaKP" } }, { "cell_type": "markdown", "source": [ "## CHALLENGE: Manual Hyperparameter Tuning" ], "metadata": { "id": "fOikBhQYwBhG" } }, { "cell_type": "markdown", "source": [ "Play with:\n", "1. Adding additional layers\n", "2. Adding additional neurons\n", "3. Change the number of epochs.\n", "\n", "NOTE: After each change, run the cell to plot the graph and review the loss curves." ], "metadata": { "id": "TH4MgbsVyRhB" } }, { "cell_type": "code", "source": [ "## Build Model\n", "model = Sequential()\n", "model.add(Dense(4,input_dim=4, activation='relu'))\n", "model.add(Dense(3,activation='relu'))\n", "model.add(Dense(1))\n", "\n", "# Compile Model\n", "model.compile(optimizer='adam', loss='mse',metrics=['mse'])\n", "\n", "# Fit the Model\n", "history = model.fit(X_train, y_train, validation_data = (X_test, y_test),\n", " epochs = 32)" ], "metadata": { "id": "C-urqQicwAdn" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "plt.figure(figsize=(15,8))\n", "plt.plot(history.history['loss'])\n", "plt.plot(history.history['val_loss'])\n", "plt.title('Model Loss (MSE) on Training and Validation Data')\n", "plt.ylabel('Loss-Mean Squred Error')\n", "plt.xlabel('Epoch')\n", "plt.legend(['Val Loss', 'Train Loss'], loc='upper right')\n", "plt.show()" ], "metadata": { "id": "GLCPzbzYxNll" }, "execution_count": null, "outputs": [] } ], "metadata": { "colab": { "provenance": [] }, "environment": { "kernel": "python3", "name": "tf2-gpu.2-6.m87", "type": "gcloud", "uri": "gcr.io/deeplearning-platform-release/tf2-gpu.2-6:m87" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.12" } }, "nbformat": 4, "nbformat_minor": 0 }