{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# GPyOpt: The tool for Bayesian Optimization \n", "\n", "### Written by Javier Gonzalez, Amazon Reseach Cambridge, UK.\n", "\n", "## Reference Manual index\n", "\n", "*Last updated Monday, 22 May 2017.*\n", "\n", "=====================================================================================================\n", "\n", "1. **What is GPyOpt?**\n", "\n", "2. **Installation and setup**\n", "\n", "3. **First steps with GPyOpt and Bayesian Optimization**\n", "\n", "4. **Alternative GPyOpt interfaces: Standard, Modular and Spearmint**\n", "\n", "5. **What can I do with GPyOpt?**\n", " 1. Bayesian optimization with arbitrary restrictions.\n", " 2. Parallel Bayesian optimization.\n", " 3. Mixing different types of variables.\n", " 4. Armed bandits problems.\n", " 5. Tuning Scikit-learn models.\n", " 6. Integrating model hyperparameters.\n", " 7. Input warping.\n", " 8. Using various cost evaluations functions.\n", " 9. Contextual variables.\n", " 10. External objective evaluation.\n", " \n", "6. **Currently supported models, acquisitions and initial designs**\n", " 1. Supported initial designs.\n", " 2. Implementing new models.\n", " 3. Implementing new acquisistion.\n", "\n", "=====================================================================================================\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. What is GPyOpt?\n", "\n", "[GPyOpt](http://sheffieldml.github.io/GPy/) is a tool for optimization (minimization) of black-box functions using Gaussian processes. It has been implemented in [Python](https://www.python.org/download/releases/2.7/) by the [group of Machine Learning](http://ml.dcs.shef.ac.uk/sitran/) (at SITraN) of the University of Sheffield. \n", "\n", "GPyOpt is based on [GPy](https://github.com/SheffieldML/GPy), a library for Gaussian process modeling in Python. [Here](http://nbviewer.ipython.org/github/SheffieldML/notebook/blob/master/GPy/index.ipynb) you can also find some notebooks about GPy functionalities. GPyOpt is a tool for Bayesian Optimization but we also use it for academic dissemination in [Gaussian Processes Summer Schools](gpss.cc), where you can find some extra labs and a variety of talks on Gaussian processes and Bayesian optimization.\n", "\n", "The purpose of this manual is to provide a guide to use GPyOpt. The framework is [BSD-3 licensed](https://opensource.org/licenses/BSD-3-Clause) and we welcome collaborators to develop new functionalities. If you have any question or suggestions about the notebooks, please write an issue in the [GitHub repository](https://github.com/SheffieldML/GPyOpt)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Installation and setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The simplest way to install GPyOpt is using pip. Ubuntu users can do:\n", "\n", "```\n", "sudo apt-get install python-pip\n", "pip install gpyopt\n", "```\n", "\n", "If you'd like to install from source, or want to contribute to the project (e.g. by sending pull requests via github), read on. Clone the repository in GitHub and add it to your $PYTHONPATH.\n", "\n", "\n", "```\n", "git clone git@github.com:SheffieldML/GPyOpt.git ~/SheffieldML\n", "echo 'PYTHONPATH=$PYTHONPATH:~/SheffieldML' >> ~/.bashrc\n", "```\n", "\n", "There are a number of dependencies that you may need to install. Three of them are needed to ensure the good behaviour of the package. These are, GPy, numpy and scipy. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Alternative GPyOpt interfaces: Standard, Modular and Spearmint\n", "\n", "GPyOpt has different interfaces oriented to different types of users. Apart from the general interface (detailed in the introductory manual), you can use GPyOpt in a modular way: you can implement and use your own elements of the optimization process, such as a new model or acquisition function, while still using the main backbone of the package. Check the [GPyOpt: Modular Bayesian Optimization](./GPyOpt_modular_bayesian_optimization.ipynb) notebook if you are interested in using GPyOpt this way. \n", "\n", "Also, we have developed a GPyOpt interface to Spearmint, but it only covers some of the general features available in GPyOpt." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 5. What can I do with GPyOpt?\n", "\n", "There are several options implemented in GPyOpt that allow you to cover a wide range of specific optimization problems. We have written a collection of notebooks that explain these functionalities separately, but they can be easily combined. " ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 5.1. Bayesian optimization with arbitrary restrictions" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "With GPyOpt you can solve optimization problems subject to arbitrary non-trivial restrictions. Have a look at the notebook [GPyOpt: Bayesian Optimization with fixed constraints](./GPyOpt_constrained_optimization.ipynb) if you want to know more about how to use GPyOpt in this type of problem." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 5.2 Parallel Bayesian optimization\n", "The main bottleneck when using Bayesian optimization is the cost of evaluating the objective function. In the notebook [GPyOpt: parallel Bayesian Optimization](GPyOpt_parallel_optimization.ipynb) you can learn more about the different parallel methods currently implemented in GPyOpt. \n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 5.3 Mixing different types of variables\n", "In GPyOpt you can easily combine different types of variables in the optimization. Currently you can use discrete and continuous variables. GPyOpt handles discrete variables by marginally optimizing the acquisition function over all combinations of feasible values. This may slow down the optimization if many discrete variables are used, but it avoids rounding errors. See the notebook entitled [GPyOpt: mixing different types of variables](./GPyOpt_mixed_domain.ipynb) for further details, and the domain sketch below. " ] },
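{ "cell_type": "markdown", "metadata": {}, "source": [ "As a rough illustration of such a mixed domain (the variable names, ranges and objective below are made up for the example), continuous and discrete variables are declared together as a list of dictionaries:\n", "\n", "```\n", "import numpy as np\n", "import GPyOpt\n", "\n", "# Hypothetical objective; column 0 of x is the continuous variable,\n", "# column 1 the discrete one. Returns one value per row of x.\n", "def f(x):\n", "    return ((x[:, 0] - 0.5)**2 + 0.1 * x[:, 1])[:, None]\n", "\n", "mixed_domain = [\n", "    {'name': 'length_scale', 'type': 'continuous', 'domain': (0.0, 1.0)},\n", "    {'name': 'n_units', 'type': 'discrete', 'domain': (16, 32, 64, 128)}\n", "]\n", "\n", "myBopt = GPyOpt.methods.BayesianOptimization(f=f, domain=mixed_domain)\n", "myBopt.run_optimization(max_iter=10)\n", "```" ] },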
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 5.4 Armed bandit problems\n", "\n", "Armed bandit optimization problems are a particular case of Bayesian optimization that appears when the domain of the function to optimize is entirely discrete. This has several advantages with respect to optimizing in continuous domains. The most remarkable is that the acquisition function can be optimized by taking the $\\arg\\min$ over all candidate points, while the rest of the BO theory applies unchanged. In the notebook [GPyOpt: armed bandits optimization](./GPyOpt_bandits_optimization.ipynb) you can check how to use GPyOpt in these types of problems. " ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 5.5 Tuning Scikit-learn models\n", "\n", "[Scikit-learn](http://scikit-learn.org/stable/) is a very popular library with a large variety of useful methods for Machine Learning. Have a look at the notebook [GPyOpt: configuring Scikit-learn methods](GPyOpt_scikitlearn.ipynb) to learn how to tune the parameters of Scikit-learn methods using GPyOpt. You will learn how to automatically tune the parameters of a Support Vector Regression." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 5.6 Integrating the model hyperparameters\n", "Maximum likelihood estimation can be a very unstable choice for tuning the surrogate model hyperparameters, especially in the first steps of the optimization. When using a GP model as a surrogate of your function, you can integrate the most common acquisition functions with respect to the parameters of the model. Check the notebook [GPyOpt: integrating model hyperparameters](./GPyOpt_integrating_model_hyperparameters.ipynb) to see how to use this option.\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 5.7 Input warping" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "TODO" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 5.8 Using various cost evaluation functions\n", "The cost of evaluating the objective can be a crucial factor in the optimization process. Check the notebook [GPyOpt: dealing with cost functions](./GPyOpt_cost_functions.ipynb) to learn how to use arbitrary cost functions, including the objective evaluation time." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 5.9 Contextual variables" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "During the optimization phase, you may want to fix the value of some of the variables. These variables are called context variables, as they are part of the objective but are fixed when the acquisition is optimized. You can learn how to use them in [this](GPyOpt_context.ipynb) notebook." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### 5.10 External objective evaluation" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "If you cannot define your objective function in Python, you can evaluate it externally and use GPyOpt to suggest the next locations to evaluate. This approach is illustrated [here](GPyOpt_external_objective_evaluation.ipynb) and sketched in the snippet below." ] },
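{ "cell_type": "markdown", "metadata": {}, "source": [ "A minimal sketch of this external loop, where `evaluate_externally` is a hypothetical stand-in for whatever produces your observations (a lab experiment, another program, etc.): create the optimizer with `f=None` together with the data collected so far, ask for the next location, evaluate it outside Python, and repeat:\n", "\n", "```\n", "import numpy as np\n", "import GPyOpt\n", "\n", "domain = [{'name': 'x', 'type': 'continuous', 'domain': (0, 1)}]\n", "\n", "def evaluate_externally(x):\n", "    # Placeholder for the external evaluation of the objective.\n", "    return (6*x - 2)**2 * np.sin(12*x - 4)\n", "\n", "X = np.array([[0.0], [0.5], [1.0]])   # locations evaluated so far\n", "Y = evaluate_externally(X)            # observed values at those locations\n", "\n", "for _ in range(10):\n", "    bo = GPyOpt.methods.BayesianOptimization(f=None, domain=domain, X=X, Y=Y)\n", "    x_next = bo.suggest_next_locations()\n", "    y_next = evaluate_externally(x_next)\n", "    X = np.vstack((X, x_next))\n", "    Y = np.vstack((Y, y_next))\n", "```" ] },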
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6.1 Implemeting new models\n", "\n", "The currently available models in GPyOpt are:\n", "\n", "- Standard GPs (with MLE and HMC inference)\n", "- Sparse GPs\n", "- Warperd GPs (both for the input and the output)\n", "- Random Forrests\n", "\n", "On top of this, if you want to implement your own model you can learn how to that in [this](GPyOpt_creating_new_models.ipynb) notebook.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6.2 Implementing new acquisitions" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "The currently available acquisition functions in GPyOpt are:\n", "\n", "- Expected Improvement.\n", "- Maximum Probability of Improvement.\n", "- Lower Confidence Bound.\n", "\n", "On top of this, if you want to implement your own model you can learn how to that in [this](GPyOpt_creating_new_aquisitions.ipynb) notebook.\n" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 1 }