{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "first_steps_with_tensor_flow.ipynb", "version": "0.3.2", "provenance": [], "collapsed_sections": [ "JndnmDMp66FL", "ajVM7rkoYXeL", "ci1ISxxrZ7v0" ], "include_colab_link": true }, "kernelspec": { "name": "python2", "display_name": "Python 2" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "metadata": { "id": "4f3CKqFUqL2-", "colab_type": "text" }, "source": [ "## টেনসর ফ্লো দিয়ে প্রথম ধাপ, গুগল মেশিন লার্নিং ক্র্যাশ কোর্স থেকে " ] }, { "cell_type": "markdown", "metadata": { "id": "Bd2Zkk1LE2Zr", "colab_type": "text" }, "source": [ "**Learning Objectives:**\n", " * Learn fundamental TensorFlow concepts\n", " * Use the `LinearRegressor` class in TensorFlow to predict median housing price, at the granularity of city blocks, based on one input feature\n", " * Evaluate the accuracy of a model's predictions using Root Mean Squared Error (RMSE)\n", " * Improve the accuracy of a model by tuning its hyperparameters" ] }, { "cell_type": "markdown", "metadata": { "id": "MxiIKhP4E2Zr", "colab_type": "text" }, "source": [ "The [data](https://developers.google.com/machine-learning/crash-course/california-housing-data-description) is based on 1990 census data from California." ] }, { "cell_type": "markdown", "metadata": { "id": "6TjLjL9IU80G", "colab_type": "text" }, "source": [ "## Setup\n", "In this first cell, we'll load the necessary libraries." ] }, { "cell_type": "code", "metadata": { "id": "rVFf5asKE2Zt", "colab_type": "code", "colab": {} }, "source": [ "from __future__ import print_function\n", "\n", "import math\n", "\n", "from IPython import display\n", "from matplotlib import cm\n", "from matplotlib import gridspec\n", "from matplotlib import pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "from sklearn import metrics\n", "import tensorflow as tf\n", "from tensorflow.python.data import Dataset\n", "\n", "tf.logging.set_verbosity(tf.logging.ERROR)\n", "pd.options.display.max_rows = 10\n", "pd.options.display.float_format = '{:.1f}'.format" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "ipRyUHjhU80Q", "colab_type": "text" }, "source": [ "Next, we'll load our data set." ] }, { "cell_type": "code", "metadata": { "id": "9ivCDWnwE2Zx", "colab_type": "code", "colab": {} }, "source": [ "california_housing_dataframe = pd.read_csv(\"https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv\", sep=\",\")" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "vVk_qlG6U80j", "colab_type": "text" }, "source": [ "We'll randomize the data, just to be sure not to get any pathological ordering effects that might harm the performance of Stochastic Gradient Descent. Additionally, we'll scale `median_house_value` to be in units of thousands, so it can be learned a little more easily with learning rates in a range that we usually use." 
] }, { "cell_type": "code", "metadata": { "id": "r0eVyguIU80m", "colab_type": "code", "colab": {} }, "source": [ "california_housing_dataframe = california_housing_dataframe.reindex(\n", " np.random.permutation(california_housing_dataframe.index))\n", "california_housing_dataframe[\"median_house_value\"] /= 1000.0\n", "california_housing_dataframe" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "HzzlSs3PtTmt", "colab_type": "text" }, "source": [ "## Examine the Data\n", "\n", "It's a good idea to get to know your data a little bit before you work with it.\n", "\n", "We'll print out a quick summary of a few useful statistics on each column: count of examples, mean, standard deviation, max, min, and various quantiles." ] }, { "cell_type": "code", "metadata": { "id": "gzb10yoVrydW", "colab_type": "code", "cellView": "both", "colab": {} }, "source": [ "california_housing_dataframe.describe()" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "Lr6wYl2bt2Ep", "colab_type": "text" }, "source": [ "## Build the First Model\n", "\n", "In this exercise, we'll try to predict `median_house_value`, which will be our label (sometimes also called a target). We'll use `total_rooms` as our input feature.\n", "\n", "**NOTE:** Our data is at the city block level, so this feature represents the total number of rooms in that block.\n", "\n", "To train our model, we'll use the [LinearRegressor](https://www.tensorflow.org/api_docs/python/tf/estimator/LinearRegressor) interface provided by the TensorFlow [Estimator](https://www.tensorflow.org/get_started/estimator) API. This API takes care of a lot of the low-level model plumbing, and exposes convenient methods for performing model training, evaluation, and inference." ] }, { "cell_type": "markdown", "metadata": { "id": "0cpcsieFhsNI", "colab_type": "text" }, "source": [ "### Step 1: Define Features and Configure Feature Columns" ] }, { "cell_type": "markdown", "metadata": { "id": "EL8-9d4ZJNR7", "colab_type": "text" }, "source": [ "In order to import our training data into TensorFlow, we need to specify what type of data each feature contains. There are two main types of data we'll use in this and future exercises:\n", "\n", "* **Categorical Data**: Data that is textual. In this exercise, our housing data set does not contain any categorical features, but examples you might see would be the home style, the words in a real-estate ad.\n", "\n", "* **Numerical Data**: Data that is a number (integer or float) and that you want to treat as a number. As we will discuss more later sometimes you might want to treat numerical data (e.g., a postal code) as if it were categorical.\n", "\n", "In TensorFlow, we indicate a feature's data type using a construct called a **feature column**. Feature columns store only a description of the feature data; they do not contain the feature data itself.\n", "\n", "To start, we're going to use just one numeric input feature, `total_rooms`. 
The following code pulls the `total_rooms` data from our `california_housing_dataframe` and defines the feature column using `numeric_column`, which specifies that its data is numeric:" ] }, { "cell_type": "code", "metadata": { "id": "rhEbFCZ86cDZ", "colab_type": "code", "colab": {} }, "source": [ "# Define the input feature: total_rooms.\n", "my_feature = california_housing_dataframe[[\"total_rooms\"]]\n", "\n", "# Configure a numeric feature column for total_rooms.\n", "feature_columns = [tf.feature_column.numeric_column(\"total_rooms\")]" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "K_3S8teX7Rd2", "colab_type": "text" }, "source": [ "**NOTE:** The shape of our `total_rooms` data is a one-dimensional array (a list of the total number of rooms for each block). This is the default shape for `numeric_column`, so we don't have to pass it as an argument." ] }, { "cell_type": "markdown", "metadata": { "id": "UMl3qrU5MGV6", "colab_type": "text" }, "source": [ "### Step 2: Define the Target" ] }, { "cell_type": "markdown", "metadata": { "id": "cw4nrfcB7kyk", "colab_type": "text" }, "source": [ "Next, we'll define our target, which is `median_house_value`. Again, we can pull it from our `california_housing_dataframe`:" ] }, { "cell_type": "code", "metadata": { "id": "l1NvvNkH8Kbt", "colab_type": "code", "colab": {} }, "source": [ "# Define the label.\n", "targets = california_housing_dataframe[\"median_house_value\"]" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "4M-rTFHL2UkA", "colab_type": "text" }, "source": [ "### Step 3: Configure the LinearRegressor" ] }, { "cell_type": "markdown", "metadata": { "id": "fUfGQUNp7jdL", "colab_type": "text" }, "source": [ "Next, we'll configure a linear regression model using LinearRegressor. We'll train this model using the `GradientDescentOptimizer`, which implements Mini-Batch Stochastic Gradient Descent (SGD). The `learning_rate` argument controls the size of the gradient step.\n", "\n", "**NOTE:** To be safe, we also apply [gradient clipping](https://developers.google.com/machine-learning/glossary/#gradient_clipping) to our optimizer via `clip_gradients_by_norm`. Gradient clipping ensures that the magnitude of the gradients does not become too large during training; excessively large gradients can cause gradient descent to fail.
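\n\nAs a purely illustrative example: if a training step produced the gradient vector (30, 40), whose L2 norm is 50, clipping with a clip norm of 5.0 would rescale it to (3, 4) before the weights are updated.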
" ] }, { "cell_type": "code", "metadata": { "id": "ubhtW-NGU802", "colab_type": "code", "colab": {} }, "source": [ "# Use gradient descent as the optimizer for training the model.\n", "my_optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.0000001)\n", "my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)\n", "\n", "# Configure the linear regression model with our feature columns and optimizer.\n", "# Set a learning rate of 0.0000001 for Gradient Descent.\n", "linear_regressor = tf.estimator.LinearRegressor(\n", " feature_columns=feature_columns,\n", " optimizer=my_optimizer\n", ")" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "-0IztwdK2f3F", "colab_type": "text" }, "source": [ "### Step 4: Define the Input Function" ] }, { "cell_type": "markdown", "metadata": { "id": "S5M5j6xSCHxx", "colab_type": "text" }, "source": [ "To import our California housing data into our `LinearRegressor`, we need to define an input function, which instructs TensorFlow how to preprocess\n", "the data, as well as how to batch, shuffle, and repeat it during model training.\n", "\n", "First, we'll convert our *pandas* feature data into a dict of NumPy arrays. We can then use the TensorFlow [Dataset API](https://www.tensorflow.org/programmers_guide/datasets) to construct a dataset object from our data, and then break\n", "our data into batches of `batch_size`, to be repeated for the specified number of epochs (num_epochs). \n", "\n", "**NOTE:** When the default value of `num_epochs=None` is passed to `repeat()`, the input data will be repeated indefinitely.\n", "\n", "Next, if `shuffle` is set to `True`, we'll shuffle the data so that it's passed to the model randomly during training. The `buffer_size` argument specifies\n", "the size of the dataset from which `shuffle` will randomly sample.\n", "\n", "Finally, our input function constructs an iterator for the dataset and returns the next batch of data to the LinearRegressor." ] }, { "cell_type": "code", "metadata": { "id": "RKZ9zNcHJtwc", "colab_type": "code", "colab": {} }, "source": [ "def my_input_fn(features, targets, batch_size=1, shuffle=True, num_epochs=None):\n", " \"\"\"Trains a linear regression model of one feature.\n", " \n", " Args:\n", " features: pandas DataFrame of features\n", " targets: pandas DataFrame of targets\n", " batch_size: Size of batches to be passed to the model\n", " shuffle: True or False. Whether to shuffle the data.\n", " num_epochs: Number of epochs for which data should be repeated. None = repeat indefinitely\n", " Returns:\n", " Tuple of (features, labels) for next data batch\n", " \"\"\"\n", " \n", " # Convert pandas data into a dict of np arrays.\n", " features = {key:np.array(value) for key,value in dict(features).items()} \n", " \n", " # Construct a dataset, and configure batching/repeating.\n", " ds = Dataset.from_tensor_slices((features,targets)) # warning: 2GB limit\n", " ds = ds.batch(batch_size).repeat(num_epochs)\n", " \n", " # Shuffle the data, if specified.\n", " if shuffle:\n", " ds = ds.shuffle(buffer_size=10000)\n", " \n", " # Return the next batch of data.\n", " features, labels = ds.make_one_shot_iterator().get_next()\n", " return features, labels" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "wwa6UeA1V5F_", "colab_type": "text" }, "source": [ "**NOTE:** We'll continue to use this same input function in later exercises. 
For more\n", "detailed documentation of input functions and the `Dataset` API, see the [TensorFlow Programmer's Guide](https://www.tensorflow.org/programmers_guide/datasets)." ] }, { "cell_type": "markdown", "metadata": { "id": "4YS50CQb2ooO", "colab_type": "text" }, "source": [ "### Step 5: Train the Model" ] }, { "cell_type": "markdown", "metadata": { "id": "yP92XkzhU803", "colab_type": "text" }, "source": [ "We can now call `train()` on our `linear_regressor` to train the model. We'll wrap `my_input_fn` in a `lambda`\n", "so we can pass in `my_feature` and `targets` as arguments (see this [TensorFlow input function tutorial](https://www.tensorflow.org/get_started/input_fn#passing_input_fn_data_to_your_model) for more details), and to start, we'll\n", "train for 100 steps." ] }, { "cell_type": "code", "metadata": { "id": "5M-Kt6w8U803", "colab_type": "code", "colab": {} }, "source": [ "_ = linear_regressor.train(\n", " input_fn = lambda:my_input_fn(my_feature, targets),\n", " steps=100\n", ")" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "7Nwxqxlx2sOv", "colab_type": "text" }, "source": [ "### Step 6: Evaluate the Model" ] }, { "cell_type": "markdown", "metadata": { "id": "KoDaF2dlJQG5", "colab_type": "text" }, "source": [ "Let's make predictions on that training data, to see how well our model fit it during training.\n", "\n", "**NOTE:** Training error measures how well your model fits the training data, but it **_does not_** measure how well your model **_generalizes to new data_**. In later exercises, you'll explore how to split your data to evaluate your model's ability to generalize.\n" ] }, { "cell_type": "code", "metadata": { "id": "pDIxp6vcU809", "colab_type": "code", "colab": {} }, "source": [ "# Create an input function for predictions.\n", "# Note: Since we're making just one prediction for each example, we don't \n", "# need to repeat or shuffle the data here.\n", "prediction_input_fn =lambda: my_input_fn(my_feature, targets, num_epochs=1, shuffle=False)\n", "\n", "# Call predict() on the linear_regressor to make predictions.\n", "predictions = linear_regressor.predict(input_fn=prediction_input_fn)\n", "\n", "# Format predictions as a NumPy array, so we can calculate error metrics.\n", "predictions = np.array([item['predictions'][0] for item in predictions])\n", "\n", "# Print Mean Squared Error and Root Mean Squared Error.\n", "mean_squared_error = metrics.mean_squared_error(predictions, targets)\n", "root_mean_squared_error = math.sqrt(mean_squared_error)\n", "print(\"Mean Squared Error (on training data): %0.3f\" % mean_squared_error)\n", "print(\"Root Mean Squared Error (on training data): %0.3f\" % root_mean_squared_error)" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "AKWstXXPzOVz", "colab_type": "text" }, "source": [ "Is this a good model? How would you judge how large this error is?\n", "\n", "Mean Squared Error (MSE) can be hard to interpret, so we often look at Root Mean Squared Error (RMSE)\n", "instead. 
A nice property of RMSE is that it can be interpreted on the same scale as the original targets.\n", "\n", "Let's compare the RMSE to the difference of the min and max of our targets:" ] }, { "cell_type": "code", "metadata": { "id": "7UwqGbbxP53O", "colab_type": "code", "colab": {} }, "source": [ "min_house_value = california_housing_dataframe[\"median_house_value\"].min()\n", "max_house_value = california_housing_dataframe[\"median_house_value\"].max()\n", "min_max_difference = max_house_value - min_house_value\n", "\n", "print(\"Min. Median House Value: %0.3f\" % min_house_value)\n", "print(\"Max. Median House Value: %0.3f\" % max_house_value)\n", "print(\"Difference between Min. and Max.: %0.3f\" % min_max_difference)\n", "print(\"Root Mean Squared Error: %0.3f\" % root_mean_squared_error)" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "JigJr0C7Pzit", "colab_type": "text" }, "source": [ "Our error spans nearly half the range of the target values. Can we do better?\n", "\n", "This is the question that nags at every model developer. Let's develop some basic strategies to reduce model error.\n", "\n", "The first thing we can do is take a look at how well our predictions match our targets, in terms of overall summary statistics." ] }, { "cell_type": "code", "metadata": { "id": "941nclxbzqGH", "colab_type": "code", "cellView": "both", "colab": {} }, "source": [ "calibration_data = pd.DataFrame()\n", "calibration_data[\"predictions\"] = pd.Series(predictions)\n", "calibration_data[\"targets\"] = pd.Series(targets)\n", "calibration_data.describe()" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "E2-bf8Hq36y8", "colab_type": "text" }, "source": [ "Okay, maybe this information is helpful. How does the mean value compare to the model's RMSE? How about the various quantiles?\n", "\n", "We can also visualize the data and the line we've learned. Recall that linear regression on a single feature can be drawn as a line mapping input *x* to output *y*.\n", "\n", "First, we'll get a uniform random sample of the data so we can make a readable scatter plot." ] }, { "cell_type": "code", "metadata": { "id": "SGRIi3mAU81H", "colab_type": "code", "colab": {} }, "source": [ "sample = california_housing_dataframe.sample(n=300)" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "N-JwuJBKU81J", "colab_type": "text" }, "source": [ "Next, we'll plot the line we've learned, drawing from the model's bias term and feature weight, together with the scatter plot. The line will show up red." 
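, "\n", "\n", "Concretely, we'll retrieve the trained weight and bias from the model, evaluate y = weight * x + bias at the smallest and largest `total_rooms` values in the sample, and draw the line segment between those two points."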
] }, { "cell_type": "code", "metadata": { "id": "7G12E76-339G", "colab_type": "code", "cellView": "both", "colab": {} }, "source": [ "# Get the min and max total_rooms values.\n", "x_0 = sample[\"total_rooms\"].min()\n", "x_1 = sample[\"total_rooms\"].max()\n", "\n", "# Retrieve the final weight and bias generated during training.\n", "weight = linear_regressor.get_variable_value('linear/linear_model/total_rooms/weights')[0]\n", "bias = linear_regressor.get_variable_value('linear/linear_model/bias_weights')\n", "\n", "# Get the predicted median_house_values for the min and max total_rooms values.\n", "y_0 = weight * x_0 + bias \n", "y_1 = weight * x_1 + bias\n", "\n", "# Plot our regression line from (x_0, y_0) to (x_1, y_1).\n", "plt.plot([x_0, x_1], [y_0, y_1], c='r')\n", "\n", "# Label the graph axes.\n", "plt.ylabel(\"median_house_value\")\n", "plt.xlabel(\"total_rooms\")\n", "\n", "# Plot a scatter plot from our data sample.\n", "plt.scatter(sample[\"total_rooms\"], sample[\"median_house_value\"])\n", "\n", "# Display graph.\n", "plt.show()" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "t0lRt4USU81L", "colab_type": "text" }, "source": [ "This initial line looks way off. See if you can look back at the summary stats and see the same information encoded there.\n", "\n", "Together, these initial sanity checks suggest we may be able to find a much better line." ] }, { "cell_type": "markdown", "metadata": { "id": "AZWF67uv0HTG", "colab_type": "text" }, "source": [ "## Tweak the Model Hyperparameters\n", "For this exercise, we've put all the above code in a single function for convenience. You can call the function with different parameters to see the effect.\n", "\n", "In this function, we'll proceed in 10 evenly divided periods so that we can observe the model improvement at each period.\n", "\n", "For each period, we'll compute and graph training loss. This may help you judge when a model is converged, or if it needs more iterations.\n", "\n", "We'll also plot the feature weight and bias term values learned by the model over time. This is another way to see how things converge." ] }, { "cell_type": "code", "metadata": { "id": "wgSMeD5UU81N", "colab_type": "code", "colab": {} }, "source": [ "def train_model(learning_rate, steps, batch_size, input_feature=\"total_rooms\"):\n", " \"\"\"Trains a linear regression model of one feature.\n", " \n", " Args:\n", " learning_rate: A `float`, the learning rate.\n", " steps: A non-zero `int`, the total number of training steps. 
A training step\n", " consists of a forward and backward pass using a single batch.\n", " batch_size: A non-zero `int`, the batch size.\n", " input_feature: A `string` specifying a column from `california_housing_dataframe`\n", " to use as input feature.\n", " \"\"\"\n", " \n", " periods = 10\n", " steps_per_period = steps / periods\n", "\n", " my_feature = input_feature\n", " my_feature_data = california_housing_dataframe[[my_feature]]\n", " my_label = \"median_house_value\"\n", " targets = california_housing_dataframe[my_label]\n", "\n", " # Create feature columns.\n", " feature_columns = [tf.feature_column.numeric_column(my_feature)]\n", " \n", " # Create input functions.\n", " training_input_fn = lambda:my_input_fn(my_feature_data, targets, batch_size=batch_size)\n", " prediction_input_fn = lambda: my_input_fn(my_feature_data, targets, num_epochs=1, shuffle=False)\n", " \n", " # Create a linear regressor object.\n", " my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\n", " my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)\n", " linear_regressor = tf.estimator.LinearRegressor(\n", " feature_columns=feature_columns,\n", " optimizer=my_optimizer\n", " )\n", "\n", " # Set up to plot the state of our model's line each period.\n", " plt.figure(figsize=(15, 6))\n", " plt.subplot(1, 2, 1)\n", " plt.title(\"Learned Line by Period\")\n", " plt.ylabel(my_label)\n", " plt.xlabel(my_feature)\n", " sample = california_housing_dataframe.sample(n=300)\n", " plt.scatter(sample[my_feature], sample[my_label])\n", " colors = [cm.coolwarm(x) for x in np.linspace(-1, 1, periods)]\n", "\n", " # Train the model, but do so inside a loop so that we can periodically assess\n", " # loss metrics.\n", " print(\"Training model...\")\n", " print(\"RMSE (on training data):\")\n", " root_mean_squared_errors = []\n", " for period in range (0, periods):\n", " # Train the model, starting from the prior state.\n", " linear_regressor.train(\n", " input_fn=training_input_fn,\n", " steps=steps_per_period\n", " )\n", " # Take a break and compute predictions.\n", " predictions = linear_regressor.predict(input_fn=prediction_input_fn)\n", " predictions = np.array([item['predictions'][0] for item in predictions])\n", " \n", " # Compute loss.\n", " root_mean_squared_error = math.sqrt(\n", " metrics.mean_squared_error(predictions, targets))\n", " # Occasionally print the current loss.\n", " print(\" period %02d : %0.2f\" % (period, root_mean_squared_error))\n", " # Add the loss metrics from this period to our list.\n", " root_mean_squared_errors.append(root_mean_squared_error)\n", " # Finally, track the weights and biases over time.\n", " # Apply some math to ensure that the data and line are plotted neatly.\n", " y_extents = np.array([0, sample[my_label].max()])\n", " \n", " weight = linear_regressor.get_variable_value('linear/linear_model/%s/weights' % input_feature)[0]\n", " bias = linear_regressor.get_variable_value('linear/linear_model/bias_weights')\n", "\n", " x_extents = (y_extents - bias) / weight\n", " x_extents = np.maximum(np.minimum(x_extents,\n", " sample[my_feature].max()),\n", " sample[my_feature].min())\n", " y_extents = weight * x_extents + bias\n", " plt.plot(x_extents, y_extents, color=colors[period]) \n", " print(\"Model training finished.\")\n", "\n", " # Output a graph of loss metrics over periods.\n", " plt.subplot(1, 2, 2)\n", " plt.ylabel('RMSE')\n", " plt.xlabel('Periods')\n", " plt.title(\"Root Mean Squared Error vs. 
Periods\")\n", " plt.tight_layout()\n", " plt.plot(root_mean_squared_errors)\n", "\n", " # Output a table with calibration data.\n", " calibration_data = pd.DataFrame()\n", " calibration_data[\"predictions\"] = pd.Series(predictions)\n", " calibration_data[\"targets\"] = pd.Series(targets)\n", " display.display(calibration_data.describe())\n", "\n", " print(\"Final RMSE (on training data): %0.2f\" % root_mean_squared_error)" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "kg8A4ArBU81Q", "colab_type": "text" }, "source": [ "## Task 1: Achieve an RMSE of 180 or Below\n", "\n", "Tweak the model hyperparameters to improve loss and better match the target distribution.\n", "If, after 5 minutes or so, you're having trouble beating a RMSE of 180, check the solution for a possible combination." ] }, { "cell_type": "code", "metadata": { "id": "UzoZUSdLIolF", "colab_type": "code", "cellView": "both", "colab": {} }, "source": [ "train_model(\n", " learning_rate=0.00001,\n", " steps=100,\n", " batch_size=1\n", ")" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "ajVM7rkoYXeL", "colab_type": "text" }, "source": [ "### Solution\n", "\n", "Click below for one possible solution." ] }, { "cell_type": "code", "metadata": { "id": "T3zmldDwYy5c", "colab_type": "code", "colab": {} }, "source": [ "train_model(\n", " learning_rate=0.00002,\n", " steps=500,\n", " batch_size=5\n", ")" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "M8H0_D4vYa49", "colab_type": "text" }, "source": [ "This is just one possible configuration; there may be other combinations of settings that also give good results. Note that in general, this exercise isn't about finding the *one best* setting, but to help build your intutions about how tweaking the model configuration affects prediction quality." ] }, { "cell_type": "markdown", "metadata": { "id": "QU5sLyYTqzqL", "colab_type": "text" }, "source": [ "### Is There a Standard Heuristic for Model Tuning?\n", "\n", "This is a commonly asked question. The short answer is that the effects of different hyperparameters are data dependent. So there are no hard-and-fast rules; you'll need to test on your data.\n", "\n", "That said, here are a few rules of thumb that may help guide you:\n", "\n", " * Training error should steadily decrease, steeply at first, and should eventually plateau as training converges.\n", " * If the training has not converged, try running it for longer.\n", " * If the training error decreases too slowly, increasing the learning rate may help it decrease faster.\n", " * But sometimes the exact opposite may happen if the learning rate is too high.\n", " * If the training error varies wildly, try decreasing the learning rate.\n", " * Lower learning rate plus larger number of steps or larger batch size is often a good combination.\n", " * Very small batch sizes can also cause instability. First try larger values like 100 or 1000, and decrease until you see degradation.\n", "\n", "Again, never go strictly by these rules of thumb, because the effects are data dependent. Always experiment and verify." ] }, { "cell_type": "markdown", "metadata": { "id": "GpV-uF_cBCBU", "colab_type": "text" }, "source": [ "## Task 2: Try a Different Feature\n", "\n", "See if you can do any better by replacing the `total_rooms` feature with the `population` feature.\n", "\n", "Don't take more than 5 minutes on this portion." 
] }, { "cell_type": "code", "metadata": { "id": "YMyOxzb0ZlAH", "colab_type": "code", "colab": {} }, "source": [ "# YOUR CODE HERE" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "ci1ISxxrZ7v0", "colab_type": "text" }, "source": [ "### Solution\n", "\n", "Click below for one possible solution." ] }, { "cell_type": "code", "metadata": { "id": "SjdQQCduZ7BV", "colab_type": "code", "colab": {} }, "source": [ "train_model(\n", " learning_rate=0.00002,\n", " steps=1000,\n", " batch_size=5,\n", " input_feature=\"population\"\n", ")" ], "execution_count": 0, "outputs": [] } ] }