{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Basic use of Emukit" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# General imports and parameters of figures should be loaded at the beginning of the overview\n", "\n", "import numpy as np\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal of this notebook is to illustrate how a model can wrapped and used in different tasks in Emukit." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Navigation\n", "\n", "1. [Load your objective, collect some data, build a model](#1.-Load your objective, collect some data, build a model)\n", "2. [Load the elements to solve your problem and run the decision loop if needed](#2.-Load the elements to solve your problem and run the decision loop if needed)\n", "3. [Conclusions](#3.-Conclusions)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Load your objective, collect some data, build a model\n", "\n", "These steps are common to all methods in Emukit. Here we illustrate how to do it with the [Branin function](https://www.sfu.ca/~ssurjano/branin.html)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from emukit.test_functions import branin_function\n", "from emukit.core import ParameterSpace, ContinuousParameter\n", "from emukit.core.initial_designs import RandomDesign\n", "from GPy.models import GPRegression\n", "from emukit.model_wrappers import GPyModelWrapper\n", "from emukit.model_wrappers.gpy_quadrature_wrappers import BaseGaussianProcessGPy, RBFGPy\n", "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define the objective function\n", "In this case we use the Branin function available in Emukit." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "f, _ = branin_function()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define the parameter space\n", "The parameter space contains the definition of the input variables of the function. Currently Emukit supports continuous and discrete parameters. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "parameter_space = ParameterSpace([ContinuousParameter('x1', -5, 10), ContinuousParameter('x2', 0, 15)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Collect some observations of f\n", "In this step we are just collecting some initial random points in the parameter space of the objective. These points are used to initialize an emulator of the function. We use the `RandomDesign` class available in Emukit for this." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "num_data_points = 30\n", "design = RandomDesign(parameter_space)\n", "X = design.get_samples(num_data_points)\n", "Y = f(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fit and wrap a model to the collected data\n", "To conclude the steps that are common to **any** method in Emukit we now build an initial emulator of the objective function and we wrap the model in Emukit. In this example we use GPy but note that any modeling framework can be used here." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "model_gpy = GPRegression(X,Y)\n", "model_gpy.optimize()\n", "model_emukit = GPyModelWrapper(model_gpy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Load the package components, run the decision loop (if needed), solve your problem\n", "\n", "In this section we use the model that we have created to solve different decision tasks. When the model is used in a decision loop (Bayesian optimization, Bayesian quadrature, Experimental design) the loop needs to be loaded and elements like acquisitions and optimizer need to be passed in together with other parameters. Sensitivity analysis provide us with tools to interpret the model so no loops are needed." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Decision loops \n", "from emukit.experimental_design import ExperimentalDesignLoop\n", "from emukit.bayesian_optimization.loops import BayesianOptimizationLoop\n", "from emukit.quadrature.loop import VanillaBayesianQuadratureLoop\n", "\n", "# Acquisition functions \n", "from emukit.bayesian_optimization.acquisitions import ExpectedImprovement\n", "from emukit.experimental_design.acquisitions import ModelVariance\n", "from emukit.quadrature.acquisitions import IntegralVarianceReduction\n", "\n", "# Acquistion optimizers\n", "from emukit.core.optimization import GradientAcquisitionOptimizer\n", "\n", "# Stopping conditions\n", "from emukit.core.loop import FixedIterationsStoppingCondition\n", "from emukit.core.loop import ConvergenceStoppingCondition\n", "\n", "# Bayesian quadrature kernel and model\n", "from emukit.quadrature.kernels import QuadratureRBFLebesgueMeasure\n", "from emukit.quadrature.methods import VanillaBayesianQuadrature\n", "from emukit.quadrature.measures import LebesgueMeasure" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bayesian optimization\n", "\n", "Here we use the model to find the minimum of the objective using Bayesian optimization in a sequential way. We collect 10 batches of points in batches of size 5." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# Load core elements for Bayesian optimization\n", "expected_improvement = ExpectedImprovement(model = model_emukit)\n", "optimizer = GradientAcquisitionOptimizer(space = parameter_space)\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# Create the Bayesian optimization object\n", "bayesopt_loop = BayesianOptimizationLoop(model = model_emukit,\n", " space = parameter_space,\n", " acquisition = expected_improvement,\n", " batch_size = 5)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# Run the loop and extract the optimum\n", "# Run the loop until we either complete 10 steps or converge\n", "stopping_condition = FixedIterationsStoppingCondition(i_max = 10) | ConvergenceStoppingCondition(eps=0.01)\n", "\n", "bayesopt_loop.run_loop(f, stopping_condition)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Experimental design" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we use the same model to perform experimental design. We use the model variance. We collect 10 batches of 5 points each. After each batch is collected the model is updated." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "# Load core elements for Experimental design\n", "model_variance = ModelVariance(model = model_emukit)\n", "optimizer = GradientAcquisitionOptimizer(space = parameter_space)\n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# Create the Experimental design object\n", "expdesign_loop = ExperimentalDesignLoop(space = parameter_space,\n", " model = model_emukit,\n", " acquisition = model_variance,\n", " update_interval = 1,\n", " batch_size = 5) " ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "# Run the loop \n", "stopping_condition = FixedIterationsStoppingCondition(i_max = 10)\n", "expdesign_loop.run_loop(f, stopping_condition)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bayesian Quadrature\n", "\n", "Here we use vanilla BQ from the quadrature package to integrate the Branin function. Note that we have to create a slightly different Emukit model since the BQ package needs some additional information. But we can re-use the GPy GP model from above. We also need to specify the integral bounds, i.e., the domain which we want to integrate over." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# Define the lower and upper bounds of the integral. \n", "integral_bounds = [(-5, 10), (0, 15)]" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# Load core elements for Bayesian quadrature\n", "emukit_measure = LebesgueMeasure.from_bounds(integral_bounds)\n", "emukit_qrbf = QuadratureRBFLebesgueMeasure(RBFGPy(model_gpy.kern), emukit_measure)\n", "emukit_model = BaseGaussianProcessGPy(kern=emukit_qrbf, gpy_model=model_gpy)\n", "emukit_method = VanillaBayesianQuadrature(base_gp=emukit_model, X=X, Y=Y)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "# Create the Bayesian quadrature object\n", "bq_loop = VanillaBayesianQuadratureLoop(model=emukit_method)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "# Run the loop and extract the integral estimate\n", "num_iter = 5\n", "bq_loop.run_loop(f, stopping_condition=num_iter)\n", "integral_mean, integral_variance = bq_loop.model.integrate()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sensitivity analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To compute the sensitivity indexes of a model does not require any loop. We just load the class `MonteCarloSensitivity` and pass the model and the parameter space." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "from emukit.sensitivity.monte_carlo import MonteCarloSensitivity\n", "\n", "# No loop here, compute Sobol indices\n", "senstivity_analysis = MonteCarloSensitivity(model = model_emukit, input_domain = parameter_space)\n", "main_effects, total_effects, _ = senstivity_analysis.compute_effects(num_monte_carlo_points = 10000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Conclusions\n", "\n", "We have run all these methods using the same model and the sample objective function. With Emukit, probabilistic models can be easily tested over multiple decision problems with a very light API." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 2 }