{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Investigating different cost functions\n", "\n", "In this notebook, we take a look at the different fitting cost functions offered in PyBOP. Cost functions for fitting problems conventionally describe the distance between two points (the target and the prediction) which is to be minimised via PyBOP's optimisation algorithms. \n", "\n", "First, we install and import the required packages below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install --upgrade pip ipywidgets -q\n", "%pip install pybop -q\n", "\n", "import numpy as np\n", "\n", "import pybop\n", "\n", "go = pybop.PlotlyManager().go\n", "pybop.PlotlyManager().pio.renderers.default = \"notebook_connected\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's fix the random seed in order to generate consistent output during development, although this does not need to be done in practice." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(8)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this notebook, we need to construct parameters, a model and a problem class before we can compare differing cost functions. We start with two parameters, but this is an arbitrary selection and can be expanded given the model and data in question." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "parameters = pybop.Parameters(\n", " pybop.Parameter(\n", " \"Positive electrode thickness [m]\",\n", " prior=pybop.Gaussian(7.56e-05, 0.5e-05),\n", " bounds=[65e-06, 10e-05],\n", " ),\n", " pybop.Parameter(\n", " \"Positive particle radius [m]\",\n", " prior=pybop.Gaussian(5.22e-06, 0.5e-06),\n", " bounds=[2e-06, 9e-06],\n", " ),\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we will construct the Single Particle Model (SPM) with the Chen2020 parameter set, but like the above, this is an arbitrary selection and can be replaced with any PyBOP model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "parameter_set = pybop.ParameterSet.pybamm(\"Chen2020\")\n", "model = pybop.lithium_ion.SPM(parameter_set=parameter_set)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, as we will need reference data to compare our model predictions to (via the cost function), we will create synthetic data from the model constructed above. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t_eval = np.arange(0, 900, 10)\n", "values = model.predict(t_eval=t_eval)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can then construct the PyBOP dataset class with the synthetic data as," ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset = pybop.Dataset(\n", " {\n", " \"Time [s]\": t_eval,\n", " \"Current function [A]\": values[\"Current [A]\"].data,\n", " \"Voltage [V]\": values[\"Voltage [V]\"].data,\n", " }\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we can put this all together and construct the problem class. In this situation, we are going to compare differing fitting cost functions, so we construct the `FittingProblem`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "problem = pybop.FittingProblem(model, parameters, dataset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sum of Squared Errors and Root Mean Squared Error\n", "\n", "First, let's start with two commonly-used cost functions: the sum of squared errors (SSE) and the root mean squared error (RMSE). Constructing these classes is very concise in PyBOP, and only requires the problem class." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cost_SSE = pybop.SumSquaredError(problem)\n", "cost_RMSE = pybop.RootMeanSquaredError(problem)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we can investigate how these functions differ when fitting the parameters. To acquire the cost value for each of these, we can simply use the call method of the constructed class, such as:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.1753460077019054e-09" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cost_SSE([7.56e-05, 5.22e-06])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, we can use the `Parameters` class for this," ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[7.56e-05 5.22e-06]\n" ] }, { "data": { "text/plain": [ "1.1753460077019054e-09" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(parameters.current_value())\n", "cost_SSE(parameters.current_value())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we want to generate a random sample of candidate solutions from the parameter class prior, we can also do that as:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[7.60957550e-05 5.48691392e-06]\n" ] }, { "data": { "text/plain": [ "0.014466627355628724" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sample = parameters.rvs()\n", "print(sample)\n", "cost_SSE(sample)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Comparing RMSE and SSE\n", "\n", "Now, let's vary one of the parameters, and keep a fixed value for the other, to create a scatter plot comparing the cost values for the RMSE and SSE functions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "x_range = np.linspace(4.72e-06, 5.72e-06, 75)\n", "y_SSE = []\n", "y_RMSE = []\n", "for i in x_range:\n", " y_SSE.append(cost_SSE([7.56e-05, i]))\n", " y_RMSE.append(cost_RMSE([7.56e-05, i]))\n", "\n", "fig = go.Figure()\n", "fig.add_trace(go.Scatter(x=x_range, y=y_SSE, mode=\"lines\", name=\"SSE\"))\n", "fig.add_trace(go.Scatter(x=x_range, y=y_RMSE, mode=\"lines\", name=\"RMSE\"))\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this situation, it's clear that the curvature of the SSE cost is greater than that of the RMSE. This can improve the rate of convergence for certain optimisation algorithms. However, with incorrect hyperparameter values, larger gradients can also result in the algorithm not converging due to sampling locations outside of the \"cost valley\", e.g. infeasible parameter values." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Minkowski distance\n", "\n", "Next, let's investigate the Minkowski distance. The Minkowski cost takes a general form, which allows for hyperparameter calibration on the cost function itself, given by\n", "\n", "$\\mathcal{L_p} = \\displaystyle \\Big(\\sum_i |\\hat{y_i}-y_i|^p\\Big)^{1/p}$\n", "\n", "where $p ≥ 0$ is the order of the Minkowski distance.\n", "\n", "For $p = 1$, it is the Manhattan distance. \n", "For $p = 2$, it is the Euclidean distance. \n", "For $p ≥ 1$, the Minkowski distance is a metric, but for $0
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "y_minkowski = []\n", "for i in x_range:\n", " y_minkowski.append(cost_minkowski([7.56e-05, i]))\n", "\n", "fig = go.Figure()\n", "fig.add_trace(\n", " go.Scatter(\n", " x=x_range,\n", " y=np.asarray(y_RMSE) * np.sqrt(len(t_eval)),\n", " mode=\"lines\",\n", " name=\"RMSE*N\",\n", " )\n", ")\n", "fig.add_trace(\n", " go.Scatter(\n", " x=x_range,\n", " y=np.sqrt(y_SSE),\n", " mode=\"lines\",\n", " line=dict(dash=\"dash\"),\n", " name=\"sqrt(SSE)\",\n", " )\n", ")\n", "fig.add_trace(\n", " go.Scatter(\n", " x=x_range, y=y_minkowski, mode=\"lines\", line=dict(dash=\"dot\"), name=\"Minkowski\"\n", " )\n", ")\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As expected, these lines lie on top of one another. Now, let's take a look at how the Minkowski cost changes for different orders, `p`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "p_orders = np.append(0.75, np.linspace(1, 3, 5))\n", "y_minkowski = tuple(\n", " [pybop.Minkowski(problem, p=j)([7.56e-05, i]) for i in x_range] for j in p_orders\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = go.Figure()\n", "for k, _ in enumerate(p_orders):\n", " fig.add_trace(\n", " go.Scatter(x=x_range, y=y_minkowski[k], mode=\"lines\", name=f\"Minkowski {_}\")\n", " )\n", "fig.update_yaxes(range=[0, np.max(y_minkowski[2])])\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As seen above, the Minkowski cost allows for a range of different cost functions to be created. This provides users with another hyperparameter to calibrate for optimisation algorithm convergence. This addition does expand the global search space, and should be carefully considered before deciding upon.\n", "\n", "### Sum of Power\n", "Next, we introduce a similar cost function, the `SumofPower` implementation. This cost function is the $p$-th power of the Minkowski distance of order $p$. It provides a generalised formulation for the Sum of Squared Errors (SSE) cost function, and is given by,\n", "\n", "$\\mathcal{L_p} = \\displaystyle \\sum_i |\\hat{y_i}-y_i|^p$\n", "\n", "where $p ≥ 0$ is the power order. A few special cases include,\n", "\n", "$p = 1$: Sum of Absolute Differences\n", "$p = 2$: Sum of Squared Differences\n", "\n", "Next we repeat the above examples with the addition of the `SumofPower` class." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cost_sumofpower = pybop.SumofPower(problem, p=2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "y_sumofpower = []\n", "for i in x_range:\n", " y_sumofpower.append(cost_sumofpower([7.56e-05, i]))\n", "\n", "fig = go.Figure()\n", "fig.add_trace(\n", " go.Scatter(\n", " x=x_range,\n", " y=np.asarray(y_RMSE) * np.sqrt(len(t_eval)),\n", " mode=\"lines\",\n", " name=\"RMSE*N\",\n", " )\n", ")\n", "fig.add_trace(\n", " go.Scatter(\n", " x=x_range,\n", " y=y_SSE,\n", " mode=\"lines\",\n", " line=dict(dash=\"dash\"),\n", " name=\"SSE\",\n", " )\n", ")\n", "fig.add_trace(\n", " go.Scatter(\n", " x=x_range,\n", " y=y_sumofpower,\n", " mode=\"lines\",\n", " line=dict(dash=\"dot\"),\n", " name=\"Sum of Power\",\n", " )\n", ")\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As expected, the `SumofPower` with order `p=2` equates to the `SSE` implementation. Next, we compare the `Minkowski` to the `SumofPower`," ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "p_orders = np.append(0.75, np.linspace(1, 2, 2))\n", "\n", "y_minkowski = tuple(\n", " [pybop.Minkowski(problem, p=j)([7.56e-05, i]) for i in x_range] for j in p_orders\n", ")\n", "\n", "y_sumofpower = tuple(\n", " [pybop.SumofPower(problem, p=j)([7.56e-05, i]) for i in x_range] for j in p_orders\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = go.Figure()\n", "for k, _ in enumerate(p_orders):\n", " fig.add_trace(\n", " go.Scatter(x=x_range, y=y_minkowski[k], mode=\"lines\", name=f\"Minkowski {_}\")\n", " )\n", " fig.add_trace(\n", " go.Scatter(\n", " x=x_range,\n", " y=y_sumofpower[k],\n", " mode=\"lines\",\n", " line=dict(dash=\"dash\"),\n", " name=f\"Sum of Power {_}\",\n", " )\n", " )\n", "fig.update_yaxes(range=[0, 2.5])\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The figure demonstrates the distinct behaviour of the `Minkowski` distance and the `SumofPower` function. One notable difference is the effect of the `1/p` exponent in the `Minkowski` distance, which has a linearising impact on the response. This linearisation can enhance the robustness of certain optimisation algorithms, potentially making them less sensitive to outliers or extreme values. However, this increased robustness may come at the cost of a slower convergence rate, as the linearised response might require more iterations to reach the optimal solution. In contrast, the `SumofPower` function does not exhibit this linearising effect, which can lead to faster convergence in some cases but may be more susceptible to the influence of outliers or extreme values.\n", "\n", "In this notebook, we've shown the different fitting cost functions offered in PyBOP. Selection between these functions can affect the optimisation result in the case that the optimiser hyperparameter values are not properly calibrated. " ] } ], "metadata": { "kernelspec": { "display_name": "pybop-3.12", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.4" } }, "nbformat": 4, "nbformat_minor": 2 }