{ "metadata": { "name": "Partial-sampling in hyperopt" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "How can we sample *some* of the variables in a pyll configuration space, while assigning values to the others?\n", "\n", "Let's look at a simple example involving 2 variables 'a' and 'b'.\n", "The 'a' variable controls whether our space returns -1 or some random number, 'b'.\n", "\n", "If we just run optimization normally, then we'll find that 'a' should be 0 (the index of the choice that gives the lowest\n", "return value." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from hyperopt import hp, fmin, rand\n", "space = hp.choice('a', [-1, hp.uniform('b', 0, 1)])\n", "best = fmin(fn=lambda x: x, space=space, algo=rand.suggest, max_evals=100)\n", "print best" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "{'a': 0}\n" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "But what if someone else already set up the space, and we just run the search over the other part of the space, which corresponds to the uniform draw?\n", "\n", "The easiest way to do this is probably to *clone* the search space, while making some substitutions while we're at it.\n", "We can just make a new search space in which 'a' is no longer a hyperparameter." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# put the configuration space in a local var\n", "# so that we can work on it.\n", "print space" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0 switch\n", "1 hyperopt_param\n", "2 Literal{a}\n", "3 randint\n", "4 Literal{2}\n", "5 rng =\n", "6 Literal{}\n", "7 Literal{-1}\n", "8 float\n", "9 hyperopt_param\n", "10 Literal{b}\n", "11 uniform\n", "12 Literal{0}\n", "13 Literal{1}\n", "14 rng =\n", "15 Literal{} [line:6]\n" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The transformation we want to make on the search space is to replace the `randint` with a constant value of 1, \n", "corresponding to always choosing hyperparameter a to be the second element of the list of choices.\n", "\n", "Now, if you don't have access to the code that generated a search space, then you'll have to go digging around for the\n", "node you need to replace. There are two approaches you can use to do this: navigation and search." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from hyperopt import pyll\n", "\n", "# The \"navigation\" approach to finding an internal\n", "# search space node:\n", "randint_node_nav = space.pos_args[0].pos_args[1]\n", "print \"by navigation:\"\n", "print randint_node_nav\n", "\n", "# The \"search\" approach to finding an internal\n", "# search space node:\n", "randint_nodes = [node for node in pyll.dfs(space) if node.name == 'randint']\n", "randint_node_srch, = randint_nodes\n", "print \"by search:\"\n", "print randint_node_srch\n", "\n", "assert randint_node_nav == randint_node_srch" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "by navigation:\n", "0 randint\n", "1 Literal{2}\n", "2 rng =\n", "3 Literal{}\n", "by search:\n", "0 randint\n", "1 Literal{2}\n", "2 rng =\n", "3 Literal{}\n" ] } ], "prompt_number": 13 }, { "cell_type": "code", "collapsed": false, "input": [ "space_with_fixed_a = pyll.clone(space, memo={randint_node_nav: pyll.as_apply(1)})\n", "print space_with_fixed_a" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0 switch\n", "1 hyperopt_param\n", "2 Literal{a}\n", "3 Literal{1}\n", "4 Literal{-1}\n", "5 float\n", "6 hyperopt_param\n", "7 Literal{b}\n", "8 uniform\n", "9 Literal{0}\n", "10 Literal{1}\n", "11 rng =\n", "12 Literal{}\n" ] } ], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, having cloned the space with a new term for the randint, we can search the new space. I wasn't sure if this would work because I haven't really tested the use of hyperopt_params that wrap around non-random nodes (here we replaced the randint with a constant) but it works for random search:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "best = fmin(fn=lambda x: x, space=space_with_fixed_a, algo=rand.suggest, max_evals=100)\n", "print best" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "{'a': 1, 'b': 0.013493404710812285}\n" ] } ], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yep, sure enough: The TPE implementation is broken by a hyperparameter that turns out to be a constant. At implementation time, that was not part of the plan." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from hyperopt import tpe\n", "best = fmin(fn=lambda x: x, space=space_with_fixed_a, algo=tpe.suggest, max_evals=100)\n", "print best" ], "language": "python", "metadata": {}, "outputs": [ { "ename": "KeyError", "evalue": "'asarray'", "output_type": "pyerr", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[1;31mKeyError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;32mfrom\u001b[0m \u001b[0mhyperopt\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mtpe\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 2\u001b[1;33m \u001b[0mbest\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mfmin\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfn\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mlambda\u001b[0m \u001b[0mx\u001b[0m\u001b[1;33m:\u001b[0m \u001b[0mx\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mspace\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mspace_with_fixed_a\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0malgo\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mtpe\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msuggest\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mmax_evals\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m100\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 3\u001b[0m \u001b[1;32mprint\u001b[0m \u001b[0mbest\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m/home/bergstra/.VENV/eccv12/src/hyperopt/hyperopt/fmin.pyc\u001b[0m in \u001b[0;36mfmin\u001b[1;34m(fn, space, algo, max_evals, trials, rseed)\u001b[0m\n\u001b[0;32m 326\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 327\u001b[0m \u001b[0mrval\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mFMinIter\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0malgo\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mdomain\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtrials\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mmax_evals\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mmax_evals\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 328\u001b[1;33m \u001b[0mrval\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mexhaust\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 329\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mtrials\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0margmin\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 330\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m/home/bergstra/.VENV/eccv12/src/hyperopt/hyperopt/fmin.pyc\u001b[0m in \u001b[0;36mexhaust\u001b[1;34m(self)\u001b[0m\n\u001b[0;32m 289\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0mexhaust\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 290\u001b[0m \u001b[0mn_done\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mlen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtrials\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 291\u001b[1;33m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mrun\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mmax_evals\u001b[0m \u001b[1;33m-\u001b[0m \u001b[0mn_done\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mblock_until_done\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0masync\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 292\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtrials\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mrefresh\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 293\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m/home/bergstra/.VENV/eccv12/src/hyperopt/hyperopt/fmin.pyc\u001b[0m in \u001b[0;36mrun\u001b[1;34m(self, N, block_until_done)\u001b[0m\n\u001b[0;32m 244\u001b[0m print 'trial %i %s %s' % (d['tid'], d['state'],\n\u001b[0;32m 245\u001b[0m d['result'].get('status'))\n\u001b[1;32m--> 246\u001b[1;33m \u001b[0mnew_trials\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0malgo\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mnew_ids\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdomain\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtrials\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 247\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mnew_trials\u001b[0m \u001b[1;32mis\u001b[0m \u001b[0mbase\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mStopExperiment\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 248\u001b[0m \u001b[0mstopped\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mTrue\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m/home/bergstra/.VENV/eccv12/src/hyperopt/hyperopt/tpe.pyc\u001b[0m in \u001b[0;36msuggest\u001b[1;34m(new_ids, domain, trials, seed, prior_weight, n_startup_jobs, n_EI_candidates, gamma, linear_forgetting)\u001b[0m\n\u001b[0;32m 797\u001b[0m \u001b[0mt0\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mtime\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtime\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 798\u001b[0m \u001b[1;33m(\u001b[0m\u001b[0ms_prior_weight\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mobserved\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mobserved_loss\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mspecs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mopt_idxs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mopt_vals\u001b[0m\u001b[1;33m)\u001b[0m\u001b[0;31m \u001b[0m\u001b[0;31m\\\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 799\u001b[1;33m \u001b[1;33m=\u001b[0m \u001b[0mtpe_transform\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdomain\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mprior_weight\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mgamma\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 800\u001b[0m \u001b[0mtt\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mtime\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtime\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;33m-\u001b[0m \u001b[0mt0\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 801\u001b[0m \u001b[0mlogger\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0minfo\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'tpe_transform took %f seconds'\u001b[0m \u001b[1;33m%\u001b[0m \u001b[0mtt\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m/home/bergstra/.VENV/eccv12/src/hyperopt/hyperopt/tpe.pyc\u001b[0m in \u001b[0;36mtpe_transform\u001b[1;34m(domain, prior_weight, gamma)\u001b[0m\n\u001b[0;32m 772\u001b[0m \u001b[0mobserved_loss\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m'vals'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 773\u001b[0m \u001b[0mpyll\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mLiteral\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mgamma\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 774\u001b[1;33m \u001b[0ms_prior_weight\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 775\u001b[0m )\n\u001b[0;32m 776\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m/home/bergstra/.VENV/eccv12/src/hyperopt/hyperopt/tpe.pyc\u001b[0m in \u001b[0;36mbuild_posterior\u001b[1;34m(specs, prior_idxs, prior_vals, obs_idxs, obs_vals, oloss_idxs, oloss_vals, oloss_gamma, prior_weight)\u001b[0m\n\u001b[0;32m 654\u001b[0m \u001b[0mobs_below\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mobs_above\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mobs_memo\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mnode\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 655\u001b[0m \u001b[0maa\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;33m[\u001b[0m\u001b[0mmemo\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0ma\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0ma\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mnode\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mpos_args\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 656\u001b[1;33m \u001b[0mfn\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0madaptive_parzen_samplers\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mnode\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mname\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 657\u001b[0m \u001b[0mb_args\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;33m[\u001b[0m\u001b[0mobs_below\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mprior_weight\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0maa\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 658\u001b[0m named_args = [[kw, memo[arg]]\n", "\u001b[1;31mKeyError\u001b[0m: 'asarray'" ] } ], "prompt_number": 20 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The TPE algorithm works if we make a different replacement in the graph. If we replace the entire \"hyperopt_param\" node corresponding to hyperparameter \"a\", then it works fine." ] }, { "cell_type": "code", "collapsed": false, "input": [ "space_with_no_a = pyll.clone(space, memo={space.pos_args[0]: pyll.as_apply(1)})\n", "best = fmin(fn=lambda x: x, space=space_with_no_a, algo=tpe.suggest, max_evals=100)\n", "print best" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "{'b': 0.000269455723739237}\n" ] } ], "prompt_number": 21 } ], "metadata": {} } ] }