{ "metadata": { "name": "softmax_regression" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# pylearn2 tutorial: Softmax regression\n", "by [Ian Goodfellow](http://www-etud.iro.umontreal.ca/~goodfeli)\n", "\n", "## Introduction\n", "This ipython notebook will teach you the basics of how softmax regression works, and show you how to do softmax regression in pylearn2.\n", "\n", "To do this, we will go over several concepts:\n", "\n", "Part 1: What pylearn2 is doing for you in this example\n", "\n", " - What softmax regression is, and the math of how it works\n", "\n", " - The basic theory of how softmax regression training works\n", "\n", "Part 2: How to use pylearn2 to do softmax regression\n", "\n", " - How to load data in pylearn2, and specifically how to load the MNIST dataset\n", "\n", " - How to configure the pylearn2 SoftmaxRegression model\n", "\n", " - How to set up a pylearn2 training algorithm\n", "\n", " - How to run training with the pylearn2 train script, and interpret its output\n", "\n", " - How to analyze the results of training\n", "\n", "\n", "Note that this won't explain in detail how the individual classes are implemented. The classes\n", "follow pretty good naming conventions and have pretty good docstrings, but if you have trouble\n", "understanding them, write to me and I might add a part 3 explaining how some of the parts work\n", "under the hood.\n", "\n", "Please write to pylearn-dev@googlegroups.com if you encounter any problem with this tutorial." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Requirements\n", "\n", "Before running this notebook, you must have installed pylearn2.\n", "Follow the [download and installation instructions](http://deeplearning.net/software/pylearn2/#download-and-installation)\n", "if you have not yet done so." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part 1: What pylearn2 is doing for you in this example\n", "\n", "In this part, we won't get into any specifics of pylearn2 yet. We'll just discuss how to train a softmax regression model. If you already know about softmax regression, feel free to skip straight to part 2, where we show how to do all of this in pylearn2.\n", "\n", "### What softmax regression is, and the math of how it works\n", "\n", "Softmax regression is type of classification model (so the \"regression\" in the name is really a misnomer),\n", "which means it is a pattern recognition algorithm that maps input patterns to categories. In this tutorial,\n", "the input patterns will be images of handwritten digits, and the output category will be the identity of the\n", "digit (0-9). In other words, we will use softmax regression to solve a simple optical character recognition\n", "problem.\n", "\n", "You may have heard of [logistic regression](http://en.wikipedia.org/wiki/Logistic_regression). Logistic\n", "regression is a special case of softmax regression. Specifically, it is the case where there are only two\n", "possible output categories. Softmax regression is a generalization of logistic regression to multiple categories.\n", "\n", "Let's define some basic terms. First, we'll use the variable $x$ to represent the input to the softmax regression\n", "model. We'll use the variable $y$ to represent the output category. Let $y$ be a non-negative integer, such that\n", "$0 \\leq y < k$ , where $k$ is the number of categories $x$ may belong to. 
In our example, we are classifying\n", "handwritten digits ranging in value from 0 to 9, so the value of $y$ is very easy to interpret. When $y = 7$, the category\n", "identified is 7. In most applications, we interpret $y$ as being a numeric code identifying a category, e.g., 0 = cat, 1 = dog, 2 = airplane, etc.\n", "\n", "The job of the softmax regression classifier is to predict the probability of $x$ belonging to each class, i.e., we want\n", "to be able to compute $p(y = i \\mid x)$ for all $k$ possible values of $i$.\n", "\n", "The role of a parametric model like softmax regression is to define a set of parameters and describe how they map to functions $f$ defining $p(y \\mid x)$. In the case of softmax regression, the model assumes that the log probability of $y=i$ is an affine function of the input $x$, up to some constant $c(x)$. $c(x)$ is defined to be whatever constant is needed to make the distribution add up to 1.\n", "\n", "To make this more formal, let $p(y)$ be written as a vector $[ p(y=0), p(y=1), \\dots, p(y=k-1) ]^T$. Assume that $x$ can be represented as a vector of numbers (in this example, we will regard each pixel of a grayscale image as being represented by\n", "a number in [0,1], and we will turn the 2D array of the image into a vector by using numpy's reshape method).\n", "Then the assumption\n", "that softmax regression makes is that\n", "\n", "$$\\log p(y \\mid x) = x^T W + b + c(x) $$\n", "\n", "where $W$ is a matrix and $b$ is a vector. Note that $c(x)$ is just a scalar but here I am adding it to a vector.\n", "I'm using numpy broadcasting rules in my math here, so this means to add $c(x)$ to every element of the vector.\n", "I'll use numpy broadcasting rules throughout this tutorial.\n", "\n", "$W$ and $b$ are the parameters of the model, and determine how inputs are mapped to output categories. We usually call $W$ the \"weights\" and $b$ the \"biases.\"\n", "\n", "By doing some algebra, using the constraint that $p(y)$ must add up to 1, we get\n", "\n", "$$ p(y \\mid x) = \\frac { \\exp( x^T W + b ) } { \\sum_i \\exp(x^T W + b)_i } = \\text{softmax}( x^T W + b) $$\n", "\n", "where $\\text{softmax}$ is the [softmax activation function](http://en.wikipedia.org/wiki/Softmax_activation_function).\n", "\n", "### The basic theory of how softmax regression training works\n", "\n", "Of course, the softmax model will only assign $x$ to the right category if its parameters have been adjusted to make them specify the right mapping. To do this we need to train the model.\n", "\n", "The basic idea is that we have a collection of training examples, $\\mathcal{D}$. Each example is an $(x, y)$ tuple. We will fit\n", "the model to the training set, so that when run on the training data, it outputs a good estimate of the probability distribution\n", "over $y$ for all of the $x$s. \n", "\n", "One way to fit the model is [maximum likelihood estimation](http://en.wikipedia.org/wiki/Maximum_likelihood). Suppose we draw a category variable $\\hat{y}$ from our model's distribution $p(y \\mid x)$ for every training example independently. We want to\n", "maximize the probability of all of those labels being correct. To do this, we maximize the function\n", "\n", "$$ J( \\mathcal{D}, W, b) = \\prod_{(x,y) \\in \\mathcal{D} } p(y \\mid x ). $$\n", "\n", "That function involves lots of multiplication, of possibly very small numbers (note that the softmax activation function guarantees none of them will ever be exactly zero).
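\n",
"\n",
"To see the issue concretely, here is a tiny plain-numpy sketch (this is not pylearn2 code, and the value 0.9 is made up). Imagine the model assigned probability 0.9 to the correct label for each of 50,000 training examples:\n",
"\n",
"    import numpy as np\n",
"    p = np.full(50000, 0.9)    # made-up per-example probabilities p(y | x)\n",
"    print np.prod(p)           # the product underflows to 0.0 in float64\n",
"    print np.sum(np.log(p))    # about -5268.0, which floats handle fine\n",
"\n",
"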
Multiplying together many small numbers can result in numerical underflow.\n", "In practice, we usually take the logarithm of this function to avoid underflow. Since the logarithm is a monotonically increasing\n", "function, it doesn't change which parameter value is optimal. It does get rid of the multiplication though:\n", "\n", "$$ J( \\mathcal{D}, W, b) = \\sum_{(x,y) \\in \\mathcal{D} } \\log p(y \\mid x ). $$\n", "\n", "Many different algorithms can maximize $J$. In this tutorial, we will use an algorithm called [nonlinear conjugate gradient descent](http://en.wikipedia.org/wiki/Nonlinear_conjugate_gradient_method) to minimize $-J$. In the case of softmax regression, minimizing $-J$ is a [convex optimization problem](http://en.wikipedia.org/wiki/Convex_optimization), so any optimization algorithm\n", "should find the same solution. The choice of nonlinear conjugate gradient is mostly to demonstrate that feature of pylearn2.\n", "\n", "One problem with maximum likelihood estimation is that it can suffer from [overfitting](http://en.wikipedia.org/wiki/Overfitting). The basic intuition is that the model can memorize patterns in the training set that are specific to the training examples, i.e., patterns that are spurious and not indicative of the correct way to categorize new, previously unseen inputs. One way to prevent this is to use [early stopping](http://en.wikipedia.org/wiki/Early_stopping).\n", "Most optimization methods are iterative, in that they try out several values of $W$ and $b$, gradually looking for the best one.\n", "Early stopping refers to stopping this search before finding the absolute best values on the training set. If we start with\n", "$W$ close to the origin, then stopping early means that $W$ will not travel as far from the origin as it would if we ran\n", "the optimization procedure to completion. Early stopping corresponds to assuming that the correlations between input features and output categories are not as strong as pure maximum likelihood estimation would determine them to be.\n", "\n", "In order to pick the right point in time to stop, we divide the training set into two subsets: one that we will actually train on,\n", "and one that we use to see how well the model is generalizing to new data, the \"validation set.\" The idea is to return the model\n", "that does the best at classifying the validation set, rather than the model that assigns the highest probability to the training\n", "set." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 2: How to use pylearn2 to do softmax regression\n", "\n", "Now that we've described the theory of what we're going to do, it's time to do it! This part describes\n", "how to use pylearn2 to run the algorithms described above.\n", "\n", "## How to load data in pylearn2, and specifically how to load the MNIST dataset\n", "\n", "To train a model in pylearn2, we need to construct several objects specifying how to train it. There are two ways to do this.\n", "One is to explicitly construct them as python objects. The other is to specify them using [YAML](http://www.yaml.org/) strings. The latter option is\n", "better supported at present, so we will use that.\n", "\n", "In this ipython notebook, we will construct YAML strings in python. Most of the time when I use pylearn2, I write the yaml\n", "string out on disk, then run pylearn2's train.py script on that YAML file.
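\n",
"\n",
"(From a shell, that is typically an invocation along the lines of `train.py my_experiment.yaml`, where the filename is just a placeholder for whatever you saved; this assumes pylearn2's scripts directory is on your PATH.)\n",
"\n",
"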
In the format of this tutorial, an ipython notebook,\n", "it's easier to just do everything in python though.\n", "\n", "YAML allows the definition of third-party tags that specify how the YAML string should be deserialized, and pylearn2 has a\n", "few of those. One of them is the !obj tag, which specifies that what follows is a full specification of a python callable\n", "that returns an object. Usually this will just be a class name.\n", "\n", "In this tutorial, we will train our model on the [MNIST](http://yann.lecun.com/exdb/mnist) dataset.\n", "In order to load that, we use an !obj tag to construct an instance of pylearn2's MNIST class, found\n", "in the pylearn2.datasets.mnist python module.\n", "\n", "We can pass arguments to the MNIST class's __init__ method by defining a dictionary mapping argument names\n", "to their values.\n", "\n", "The MNIST dataset is split into a training set and a test set. Since the object we are constructing now\n", "will be used as the training set, we must specify that we want to load the training data. We can use the\n", "'which_set' argument to do this.\n", "\n", "Finally, as described above, we will use early stopping, so we shouldn't train on the entire training set.\n", "The MNIST training set contains 60,000 examples. We use the 'start' and 'stop' arguments to train on the first\n", "50,000 of them." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import os\n", "import pylearn2\n", "dirname = os.path.abspath(os.path.dirname('softmax_regression.ipynb'))\n", "with open(os.path.join(dirname, 'sr_dataset.yaml'), 'r') as f:\n", " dataset = f.read()\n", "hyper_params = {'train_stop' : 50000}\n", "dataset = dataset % (hyper_params)\n", "print dataset\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "!obj:pylearn2.datasets.mnist.MNIST {\n", " which_set: 'train',\n", " start: 0,\n", " stop: 50000\n", "}\n", "\n" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## How to configure the pylearn2 SoftmaxRegression model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we need to specify an object representing the model to be trained. To do this, we need to make an instance of the\n", "SoftmaxRegression class defined in pylearn2.models.softmax_regression. We need to specify a few details of how to configure the model.\n", "\n", "The \"nvis\" argument stands for \"number of visible units.\" In neural network terminology, the \"visible units\" are the pieces of data that the model gets to observe. This argument is asking for the dimension of $x$. If we didn't want $x$ to be a vector, there is another more flexible way of configuring the input of the model, but for vector-based models, \"nvis\" is the easiest piece of the API to use. The MNIST dataset contains 28x28 grayscale images, not vectors, so the SoftmaxRegression model will ask pylearn2 to flatten the images into vectors. That means it will receive a vector with 28\\*28=784 elements.\n", "\n", "We also need to specify how many categories or classes there are with the \"n_classes\" argument.\n", "\n", "Finally, the matrix $W$ will be randomly initialized. There are a few different initialization schemes in pylearn2. Specifying the \"irange\" argument will make each element of $W$ be initialized from $U(-\\text{irange}, \\text{irange})$. Since softmax regression training is a convex optimization problem, we can set irange to 0 to initialize all of $W$ to 0.
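\n",
"\n",
"In plain numpy terms, that initialization scheme amounts to roughly the following sketch (just an illustration of the scheme, not pylearn2's actual code):\n",
"\n",
"    import numpy as np\n",
"    nvis, n_classes = 784, 10\n",
"    irange = 0.   # as in the YAML below, so all of W starts at 0\n",
"    rng = np.random.RandomState(42)   # arbitrary seed\n",
"    W = rng.uniform(-irange, irange, size=(nvis, n_classes))\n",
"    b = np.zeros(n_classes)\n",
"\n",
"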
(Some other models require that the different columns of $W$ differ from each other initially in order for them to train correctly.)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import os\n", "import pylearn2\n", "dirname = os.path.abspath(os.path.dirname('softmax_regression.ipynb'))\n", "with open(os.path.join(dirname, 'sr_model.yaml'), 'r') as f:\n", " model = f.read()\n", "\n", "print model" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "!obj:pylearn2.models.softmax_regression.SoftmaxRegression {\n", " n_classes: 10,\n", " irange: 0.,\n", " nvis: 784,\n", "}\n", "\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## How to set up a pylearn2 training algorithm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we need to specify a training algorithm to maximize the log likelihood with. (Actually, we will minimize the negative log likelihood, because all of pylearn2's optimization algorithms are written in terms of minimizing a cost function. theano will optimize out any double-negation that results, so this has no effect on the runtime of the algorithm.)\n", "\n", "We can use an !obj tag to load pylearn2's BGD class. BGD stands for batch gradient descent. It is a class designed to train models by moving in the direction of the gradient of the objective function applied to large batches of examples.\n", "\n", "The \"batch_size\" argument determines how many examples the BGD class will act on at one time. This should be a fairly large number so that the updates are more likely to generalize to other batches.\n", "\n", "Setting \"line_search_mode\" to exhaustive means that the BGD class will try to binary search for the best possible point along the direction of the gradient of the cost function, rather than just trying out a few pre-selected step sizes. This implements the method of steepest descent.\n", "\n", "\"conjugate\" is a boolean flag. By setting it to 1, we make BGD modify the gradient directions to preserve conjugacy prior to doing the line search. This implements nonlinear conjugate gradient descent.\n", "\n", "During training, we will keep track of several different quantities of interest to the experimenter, such as the number of examples that are classified correctly, the objective function value, etc. The quantities to track are determined by the model class and by the training algorithm class. These quantities are referred to as \"channels\" and the act of tracking them is called \"monitoring\" in pylearn2 terms. In order to track them, we need to specify a monitoring dataset. In this case, we use a dictionary to make multiple, named monitoring datasets.\n", "\n", "We use \"\\*train\" to refer to the training set. The \\* is YAML syntax saying to reference an object defined elsewhere in the YAML file. Later, when we specify which dataset to train on, we will define this reference.\n", "\n", "Finally, the BGD algorithm needs to know when to stop training. We therefore give it a \"termination criterion.\" In this case, we use a monitor-based termination criterion that says to stop when too little progress is being made at reducing the value tracked by one of the monitoring channels. In this case, we use \"valid_y_misclass\", which is the rate at which the model mislabels examples on the validation set. MonitorBased has some other arguments that we don't bother to specify here, and just use the defaults.
These defaults will result in the training algorithm running for a while after the lowest value of the validation error has been reached, to make sure that we don't stop too soon just because the validation error randomly bounced upward for a few epochs.\n", "\n", "You might expect the BGD algorithm to need to be told what objective function to minimize. It turns out that if the user doesn't say what objective function to minimize, BGD will ask the model for some default objective function, by calling the model's \"get_default_cost\" method. In this case, the SoftmaxRegression model provides the negative log likelihood as the default objective function." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import os\n", "import pylearn2\n", "dirname = os.path.abspath(os.path.dirname('softmax_regression.ipynb'))\n", "with open(os.path.join(dirname, 'sr_algorithm.yaml'), 'r') as f:\n", " algorithm = f.read()\n", "hyper_params = {'batch_size' : 10000,\n", " 'valid_stop' : 60000}\n", "algorithm = algorithm % (hyper_params)\n", "print algorithm\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "!obj:pylearn2.training_algorithms.bgd.BGD {\n", " batch_size: 10000,\n", " line_search_mode: 'exhaustive',\n", " conjugate: 1,\n", " monitoring_dataset:\n", " {\n", " 'train' : *train,\n", " 'valid' : !obj:pylearn2.datasets.mnist.MNIST {\n", " which_set: 'train',\n", " start: 50000,\n", " stop: 60000\n", " },\n", " 'test' : !obj:pylearn2.datasets.mnist.MNIST {\n", " which_set: 'test',\n", " }\n", " },\n", " termination_criterion: !obj:pylearn2.termination_criteria.MonitorBased {\n", " channel_name: \"valid_y_misclass\"\n", " }\n", " }\n", "\n" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## How to run training with the pylearn2 train script, and interpret its output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now use a pylearn2 Train object to represent the training problem.\n", "\n", "We use \"&train\" here to define the reference used with the \"\\*train\" line in the algorithm section.\n", "\n", "We use the python %(varname)s syntax and the locals() dictionary to paste the dataset, model, and algorithm strings from the earlier sections into this final string here.\n", "\n", "As specified in the previous section, the model will keep training for a while after the lowest validation error is reached, just to make sure that it won't start going down again. However, the final model we would like to return is the one with the lowest validation error. We add an \"extension\" to the Train object here. Extensions are objects with callbacks that get triggered at different points in time, such as the end of a training epoch. In this case, we use the MonitorBasedSaveBest extension. Whenever the monitoring channels are updated, MonitorBasedSaveBest will check if a specific channel decreased, and if so, it will save a copy of the model. This way, the best model is saved at the end.
Here we save the model with the lowest validation set error to \"softmax_regression_best.pkl.\"" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import os\n", "import pylearn2\n", "dirname = os.path.abspath(os.path.dirname('softmax_regression.ipynb'))\n", "with open(os.path.join(dirname, 'sr_train.yaml'), 'r') as f:\n", " train = f.read()\n", "save_path = '.'\n", "train = train %locals()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Execute the cell below to see the final YAML string." ] }, { "cell_type": "code", "collapsed": false, "input": [ "print train" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "!obj:pylearn2.train.Train {\n", " dataset: &train !obj:pylearn2.datasets.mnist.MNIST {\n", " which_set: 'train',\n", " start: 0,\n", " stop: 50000\n", "}\n", ",\n", " model: !obj:pylearn2.models.softmax_regression.SoftmaxRegression {\n", " n_classes: 10,\n", " irange: 0.,\n", " nvis: 784,\n", "}\n", ",\n", " algorithm: !obj:pylearn2.training_algorithms.bgd.BGD {\n", " batch_size: 10000,\n", " line_search_mode: 'exhaustive',\n", " conjugate: 1,\n", " monitoring_dataset:\n", " {\n", " 'train' : *train,\n", " 'valid' : !obj:pylearn2.datasets.mnist.MNIST {\n", " which_set: 'train',\n", " start: 50000,\n", " stop: 60000\n", " },\n", " 'test' : !obj:pylearn2.datasets.mnist.MNIST {\n", " which_set: 'test',\n", " }\n", " },\n", " termination_criterion: !obj:pylearn2.termination_criteria.MonitorBased {\n", " channel_name: \"valid_y_misclass\"\n", " }\n", " }\n", ",\n", " extensions: [\n", " !obj:pylearn2.train_extensions.best_params.MonitorBasedSaveBest {\n", " channel_name: 'valid_y_misclass',\n", " save_path: \"softmax_regression_best.pkl\"\n", " },\n", " ],\n", " save_path: \"softmax_regression.pkl\",\n", " save_freq: 1\n", "}\n", "\n" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we use pylearn2's yaml_parse.load to construct the Train object, and run its main loop. The same thing could be accomplished by running pylearn2's train.py script on a file containing the yaml string.\n", "\n", "Execute the next cell to train the model. This will take a few minutes, and it will print out output periodically as it runs." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from pylearn2.config import yaml_parse\n", "train = yaml_parse.load(train)\n", "train.main_loop()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "compiling begin_record_entry...\n" ] }, { "output_type": "stream", "stream": "stderr", "text": [ "/u/almahaia/Code/pylearn2/pylearn2/models/mlp.py:40: UserWarning: MLP changing the recursion limit.\n", " warnings.warn(\"MLP changing the recursion limit.\")\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "compiling begin_record_entry done. 
Time elapsed: 0.127929 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitored channels: \n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_max\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max\n" ] }, 
{ "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Compiling accum...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "graph size: 58\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "graph size: 53\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "graph size: 53\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Compiling accum done. Time elapsed: 1.825620 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 2.30258509299\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.1\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.1\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.1\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.902\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 2.30258509299\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 2.30258509299\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_max: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.1\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.1\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.1\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.90136\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 2.30258509299\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", 
"text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 2.30258509299\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.1\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.1\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.1\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.9009\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 2.30258509299\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 47.135716 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 1\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 5\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 50000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 2.55542355706\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.694843087116\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 1.82795330924\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.301359300793\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 3.23311685335\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 2.91097673718\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 2.20925662298\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.99999504546\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.883456583251\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.18919041972\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0824\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.301359300793\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 0.894549596168\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.245640441388\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.312732697075\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_max: 3.23311685335\n" ] }, { 
"output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 2.91097673718\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 2.20925662298\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999997104388\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.878126747054\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.210295235229\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.08648\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.312732697075\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 0.894549596168\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.245640441388\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 47.135716\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.294293650438\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 3.23311685335\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 2.91097673718\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 2.20925662298\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999998686662\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.885458000598\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.175666181209\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0807\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.294293650438\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 0.894549596168\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.245640441388\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. 
Time elapsed: 0.037422 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.883598 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 2\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 10\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 100000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 2.56596440313\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.441610700411\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 1.16066396621\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.285237258524\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 3.91262647813\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 3.46406578099\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 2.63731792036\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999998908541\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.89510200425\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.172724479158\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0786\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.285237258524\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.04003162633\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.300680907131\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 48.107276\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.289143688973\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_max: 3.91262647813\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 3.46406578099\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 2.63731792036\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999999368949\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.890736819369\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.224060606814\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.08084\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.289143688973\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.04003162633\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.300680907131\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.883598\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.276589904503\n" ] }, { "output_type": 
"stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 3.91262647813\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 3.46406578099\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 2.63731792036\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999998435824\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.897311954342\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.225660718987\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0775\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.276589904503\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.04003162633\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.300680907131\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. Time elapsed: 0.032445 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.469979 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 3\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 15\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 150000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 2.64427660828\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.28695642402\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.756031211218\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.280055491744\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 4.38959255973\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 3.85022961032\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 3.00824662805\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999413457\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.900924742822\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.232922344039\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0779\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.280055491744\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.12073028284\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.339333001502\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 49.867692\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.278605710329\n" ] }, { "output_type": "stream", "stream": 
"stdout", "text": [ "\ttrain_y_col_norms_max: 4.38959255973\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 3.85022961032\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 3.00824662805\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999999704668\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.896695616738\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.225369612588\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.0778\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.278605710329\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.12073028284\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.339333001502\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.469979\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.272806812447\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 4.38959255973\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 3.85022961032\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 3.00824662805\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999998390007\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.902116310016\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.222342784632\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0758\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.272806812447\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.12073028284\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.339333001502\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. 
Time elapsed: 0.034981 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.332214 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 4\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 20\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 200000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 2.7328310823\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.192327609673\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.510358148581\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.27844626375\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 4.70789506799\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 4.16072458576\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 3.2649938495\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999857874\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.904837677386\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.235765614223\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0784\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.27844626375\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.18924856325\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.370615415845\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 49.427461\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.271860810833\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_max: 4.70789506799\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 4.16072458576\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 3.2649938495\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999999937335\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.900762911307\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.215634585856\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.07636\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.271860810833\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.18924856325\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.370615415845\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.332214\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.26726459723\n" ] }, { "output_type": 
"stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 4.70789506799\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 4.16072458576\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 3.2649938495\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.99999968428\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.906190087062\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.242091672253\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0746\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.26726459723\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.18924856325\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.370615415845\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. Time elapsed: 0.032767 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.096713 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 5\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 25\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 250000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 2.79555296073\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.135665128631\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.362963828727\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.273817364497\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 5.03140113083\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 4.43736580535\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 3.4996553347\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999924695\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.908420561625\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.20105158513\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0784\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.273817364497\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.27503247719\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.398188014557\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 49.265087\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.266190019836\n" ] }, { "output_type": "stream", "stream": "stdout", 
"text": [ "\ttrain_y_col_norms_max: 5.03140113083\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 4.43736580535\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 3.4996553347\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999999949637\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.904266352694\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.214983817154\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.0747\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.266190019836\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.27503247719\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.398188014557\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.096713\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.262773057665\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 5.03140113083\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 4.43736580535\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 3.4996553347\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999999834998\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.90977309107\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.227467467432\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0734\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.262773057665\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.27503247719\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.398188014557\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. 
Time elapsed: 0.040086 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.270360 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 6\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 30\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 300000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 2.86472520335\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.0995817704689\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.27145607381\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.274650362478\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 5.29667864648\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 4.6681660556\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 3.69537437025\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999953454\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.909515981021\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.240807995221\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0762\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.274650362478\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.34757538106\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.421872641122\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 49.063998\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.263465024685\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_max: 5.29667864648\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 4.6681660556\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 3.69537437025\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999999967452\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.904810346072\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.222843798769\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.07312\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.263465024685\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.34757538106\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.421872641122\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.27036\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.264160131695\n" ] }, { "output_type": 
"stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 5.29667864648\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 4.6681660556\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 3.69537437025\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999999944173\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.910249991543\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.230041435408\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0738\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.264160131695\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.34757538106\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.421872641122\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. Time elapsed: 0.343817 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.146142 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 7\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 35\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 350000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 2.94614410339\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.0791918096972\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.22197684405\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.27202668856\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 5.53374863965\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 4.89227673333\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 3.8870453917\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999976028\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.91265320303\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.224122342798\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0774\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.27202668856\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.41189262807\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.444192206987\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 49.509842\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.260971194008\n" ] }, { "output_type": "stream", "stream": "stdout", 
"text": [ "\ttrain_y_col_norms_max: 5.53374863965\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 4.89227673333\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 3.8870453917\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999999976047\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.908725530838\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.232349314658\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.0732\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.260971194008\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.41189262807\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.444192206987\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.146142\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.26436051024\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 5.53374863965\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 4.89227673333\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 3.8870453917\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999999949706\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.912402441241\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.22991605073\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0738\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.26436051024\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.41189262807\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.444192206987\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. 
Time elapsed: 1.046755 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.541562 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 8\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 40\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 400000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 2.9589170095\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.0661130527395\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.187367468335\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.270976679584\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 5.75592674885\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 5.08230762173\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 4.02942401417\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999960476\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.911246352209\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.202601016211\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0765\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.270976679584\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.5186928872\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.46350948492\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 50.092236\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.256781505833\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_max: 5.75592674885\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 5.08230762173\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 4.02942401417\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.99999996249\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.907843630275\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.227267591038\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.07108\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.256781505833\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.5186928872\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.46350948492\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.541562\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.261108444735\n" ] }, { "output_type": 
"stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 5.75592674885\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 5.08230762173\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 4.02942401417\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999999906762\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.912796132628\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.240817912865\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0717\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.261108444735\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.5186928872\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.46350948492\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. Time elapsed: 0.074899 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.921041 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 9\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 45\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 450000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 3.07640978739\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.0578164084249\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.168970324213\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.269887997532\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 5.97236412318\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 5.28605773752\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 4.20493453263\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999964196\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.914368745924\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.232516262448\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0759\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.269887997532\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.61850785481\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.483502165396\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 52.954261\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.255329041014\n" ] }, { "output_type": "stream", "stream": 
"stdout", "text": [ "\ttrain_y_col_norms_max: 5.97236412318\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 5.28605773752\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 4.20493453263\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999999968304\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.911166781743\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.233844764224\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.07102\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.255329041014\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.61850785481\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.483502165396\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.921041\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.260598210855\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 5.97236412318\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 5.28605773752\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 4.20493453263\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.99999995121\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.9160507243\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.247938293814\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0707\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.260598210855\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.61850785481\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.483502165396\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. 
Time elapsed: 0.045346 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 47.756314 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 10\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 50\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 500000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 3.17751170721\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.0527696885026\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.161604222181\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.272032441361\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 6.15020937219\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 5.45818015281\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 4.32908868031\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999982392\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.912849967862\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.243103761542\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0766\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.272032441361\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.70016734551\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.500912519066\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 50.532318\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.253810005663\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_max: 6.15020937219\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 5.45818015281\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 4.32908868031\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999999988222\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.909491011432\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.245272533783\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.07128\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.253810005663\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.70016734551\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.500912519066\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 47.756314\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.262035188475\n" ] }, { "output_type": 
"stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 6.15020937219\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 5.45818015281\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 4.32908868031\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999999987956\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.913773812077\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.257571659974\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0732\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.262035188475\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.70016734551\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.500912519066\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. Time elapsed: 0.078452 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.350366 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 11\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 55\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 550000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 3.20139199162\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.0497989460367\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.15526891131\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.269176335486\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 6.35982269721\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 5.62561655342\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 4.51263517136\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999982551\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.913888237951\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.211160862124\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0769\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.269176335486\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.77393186259\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.517690625322\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 48.733069\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.251423795062\n" ] }, { "output_type": "stream", "stream": 
"stdout", "text": [ "\ttrain_y_col_norms_max: 6.35982269721\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 5.62561655342\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 4.51263517136\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999999981008\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.910697294132\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.22384883143\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.07048\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.251423795062\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.77393186259\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.517690625322\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.350366\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.260297250861\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 6.35982269721\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 5.62561655342\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 4.51263517136\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999999976584\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.914831005565\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.253224141585\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0707\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.260297250861\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.77393186259\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.517690625322\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. 
Time elapsed: 0.287064 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.337238 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 12\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 60\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 600000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 3.18421082504\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.0480066437265\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.149554335764\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.268167696714\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 6.53694805673\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 5.78651338071\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 4.62123127003\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999994824\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.916031399063\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.253071105052\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0745\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.268167696714\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.82371102711\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.533708300513\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 49.521323\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.250407471096\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_max: 6.53694805673\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 5.78651338071\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 4.62123127003\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999999992194\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.912168287933\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.234702556895\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.07012\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.250407471096\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.82371102711\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.533708300513\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.337238\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.261415798892\n" ] }, { "output_type": 
"stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 6.53694805673\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 5.78651338071\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 4.62123127003\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999999982776\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.916843313029\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.236225524738\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0721\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.261415798892\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.82371102711\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.533708300513\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. Time elapsed: 0.076885 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.408465 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 13\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 65\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 650000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 3.32896781367\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.0465828940367\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.152447504409\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.273306774992\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 6.73632193798\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 5.95297548891\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 4.7863381339\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999991705\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.916068767119\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.228716432447\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0765\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.273306774992\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.908978446\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.550136388278\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 49.294602\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.250124890774\n" ] }, { "output_type": "stream", "stream": 
"stdout", "text": [ "\ttrain_y_col_norms_max: 6.73632193798\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 5.95297548891\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 4.7863381339\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.99999999497\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.912162787419\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.242706558496\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.06994\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.250124890774\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.908978446\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.550136388278\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.408465\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.264447621587\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 6.73632193798\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 5.95297548891\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 4.7863381339\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999999995746\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.916910058012\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.232430962179\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0726\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.264447621587\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.908978446\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.550136388278\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. 
Time elapsed: 1.955159 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Time this epoch: 48.118448 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Monitoring step:\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tEpochs seen: 14\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tBatches seen: 70\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tExamples seen: 700000\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_mult: 3.3627779367\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_grad_size: 0.0461675912642\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tave_step_size: 0.152041664159\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_objective: 0.271266060728\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_max: 6.9214493801\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_mean: 6.10749929728\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_col_norms_min: 4.91335986129\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_max_max_class: 0.999999995112\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_mean_max_class: 0.91712864321\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_min_max_class: 0.240703665988\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_misclass: 0.0766\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_nll: 0.271266060728\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_max: 1.96655587113\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_mean: 0.565461485822\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttest_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttotal_seconds_last_epoch: 51.252898\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_objective: 0.247700760285\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_max: 6.9214493801\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_mean: 6.10749929728\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_col_norms_min: 4.91335986129\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_max_max_class: 0.999999996171\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_mean_max_class: 0.913135796548\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_min_max_class: 0.237545493213\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_misclass: 0.0687\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_nll: 0.247700760285\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_max: 1.96655587113\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_mean: 0.565461485822\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttrain_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\ttraining_seconds_this_epoch: 48.118448\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_objective: 0.261790115276\n" ] }, { "output_type": 
"stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_max: 6.9214493801\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_mean: 6.10749929728\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_col_norms_min: 4.91335986129\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_max_max_class: 0.999999994777\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_mean_max_class: 0.917263145852\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_min_max_class: 0.238750828718\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_misclass: 0.0718\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_nll: 0.261790115276\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_max: 1.96655587113\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_mean: 0.565461485822\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\tvalid_y_row_norms_min: 0.0\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. Time elapsed: 0.047610 seconds\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl...\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Saving to softmax_regression.pkl done. Time elapsed: 0.072144 seconds\n" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "As the model trained, it should have printed out progress messages. Most of these are the values of the various channels being monitored throughout training." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##How to analyze the results of training" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use the print_monitor script to print the last monitoring entry of a saved model. By running it on \"softmax_regression_best.pkl\", we can see the performance of the model at the point where it did the best on the validation set.\n", "We see by executing the next cell (the ! mark tells ipython to run a shell command) that the test set misclassification rate is 0.0759, obtained after training for 9 epochs." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "!print_monitor.py softmax_regression_best.pkl" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "/u/almahaia/Code/pylearn2/pylearn2/models/mlp.py:40: UserWarning: MLP changing the recursion limit.\r\n", " warnings.warn(\"MLP changing the recursion limit.\")\r\n", "epochs seen: 9\r\n", "time trained: 458.503871202\r\n", "ave_grad_mult : 3.07640978739\r\n", "ave_grad_size : 0.0578164084249\r\n", "ave_step_size : 0.168970324213\r\n", "test_objective : 0.269887997532\r\n", "test_y_col_norms_max : 5.97236412318\r\n", "test_y_col_norms_mean : 5.28605773752\r\n", "test_y_col_norms_min : 4.20493453263\r\n", "test_y_max_max_class : 0.999999964196\r\n", "test_y_mean_max_class : 0.914368745924\r\n", "test_y_min_max_class : 0.232516262448\r\n", "test_y_misclass : 0.0759\r\n", "test_y_nll : 0.269887997532\r\n", "test_y_row_norms_max : 1.61850785481\r\n", "test_y_row_norms_mean : 0.483502165396\r\n", "test_y_row_norms_min : 0.0\r\n", "total_seconds_last_epoch : 52.954261\r\n", "train_objective : 0.255329041014\r\n", "train_y_col_norms_max : 5.97236412318\r\n", "train_y_col_norms_mean : 5.28605773752\r\n", "train_y_col_norms_min : 4.20493453263\r\n", "train_y_max_max_class : 0.999999968304\r\n", "train_y_mean_max_class : 0.911166781743\r\n", "train_y_min_max_class : 0.233844764224\r\n", "train_y_misclass : 0.07102\r\n", "train_y_nll : 0.255329041014\r\n", "train_y_row_norms_max : 1.61850785481\r\n", "train_y_row_norms_mean : 0.483502165396\r\n", "train_y_row_norms_min : 0.0\r\n", "training_seconds_this_epoch : 48.921041\r\n", "valid_objective : 0.260598210855\r\n", "valid_y_col_norms_max : 5.97236412318\r\n", "valid_y_col_norms_mean : 5.28605773752\r\n", "valid_y_col_norms_min : 4.20493453263\r\n", "valid_y_max_max_class : 0.99999995121\r\n", "valid_y_mean_max_class : 0.9160507243\r\n", "valid_y_min_max_class : 0.247938293814\r\n", "valid_y_misclass : 0.0707\r\n", "valid_y_nll : 0.260598210855\r\n", "valid_y_row_norms_max : 1.61850785481\r\n", "valid_y_row_norms_mean : 0.483502165396\r\n", "valid_y_row_norms_min : 0.0\r\n" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another common way of analyzing trained models is to look at their weights. 
Here we use the show_weights script to visualize $W$:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "!show_weights.py softmax_regression_best.pkl" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "making weights report\r\n", "loading model\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "/u/almahaia/Code/pylearn2/pylearn2/models/mlp.py:40: UserWarning: MLP changing the recursion limit.\r\n", " warnings.warn(\"MLP changing the recursion limit.\")\r\n", "loading done\r\n", "loading dataset...\r\n", "...done\r\n", "smallest enc weight magnitude: 0.0\r\n", "mean enc weight magnitude: 0.121750386838\r\n", "max enc weight magnitude: 1.46967125826\r\n", "min norm: 4.20493453263\r\n", "mean norm: 5.28605773752\r\n", "max norm: 5.97236412318\r\n" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Further reading\n", "\n", "You can find more information on softmax regression from the following sources:\n", "\n", "[LISA lab's Deep Learning Tutorials: Classifying MNIST digits using Logistic Regression](http://deeplearning.net/tutorial/logreg.html)\n", "\n", "[Stanford's Unsupervised Feature Learning and Deep Learning wiki: Softmax Regression](http://ufldl.stanford.edu/wiki/index.php/Softmax_Regression)\n", "\n", "This is by no means a complete list." ] } ], "metadata": {} } ] }