{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Week 8 Day 3: Fitting tools\n", "\n", "## Objectives:\n", "* Look again at fitting tools\n", "* Set up a project in git (review of git, project structure)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Note about crash last time!\n", "\n", "GooFit and LmFit both seem to have compiled code that expects OpenMP, and at least on my Mac they were linked with different OpenMP libraries (Anaconda's Intel libomp for LmFit, and Homebrew's libomp for GooFit). That's why it crashed. I've reinstalled GooFit without OpenMP for now." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### GooFit\n", "\n", "If you really want to try GooFit, you can try this on OSC:\n", "\n", "```\n", "!pip install --user scikit-build cmake\n", "!PATH=$HOME/.local/bin:$PATH pip install --user --verbose goofit\n", "```\n", "\n", "The extra requirements here are partly there to ensure it gets the highest level of optimization, and partly requirements that will eventually go away.\n", "\n", "If you are on macOS, scikit-build is broken; you'll need `!pip install scikit-build==0.6.1`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy.stats import norm, multivariate_normal\n", "import matplotlib.pyplot as plt\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Prepare data\n", "\n", "We'll make a dataset out of a Gaussian plus a linear component." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "state = np.random.RandomState(42)\n", "gauss_part = state.normal(1, 2, size=100_000)\n", "lin_part = state.uniform(low=-10, high=10, size=50_000)\n", "total_rand = np.concatenate([gauss_part, lin_part])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import goofit" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We need an observable, with a range from -10 to 10." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = goofit.Observable(\"x\", -10, 10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can make an unbinned (or binned) dataset; we need to list the variables it will contain." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = goofit.UnbinnedDataSet(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's read in data. By default, GooFit throws an error if you input a value that is outside the range (-10 to 10 in this case), but we can pass `filter=True` to simply ignore those values instead." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data.from_matrix([total_rand], filter=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you display a PDF in the notebook, you get a nice pretty-printed version of its documentation:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "goofit.PolynomialPdf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The fitting variables and PDFs need to be set up next." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = goofit.Variable(\"a\", 0, 0, 1)\n", "linear = goofit.PolynomialPdf(\"linear\", x, [a])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mu = goofit.Variable(\"mu\", 0, -10, 10)\n", "sigma = goofit.Variable(\"sigma\", 1, 0, 5)\n", "gauss = goofit.GaussianPdf(\"gauss\", x, mu, sigma)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can add multiple PDFs together, weighted by fit fractions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "frac = goofit.Variable(\"frac\", 0.5, 0, 1)\n", "total = goofit.AddPdf(\"tot\", [frac], [gauss, linear])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And we can fit the PDF to the data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "minimum = total.fitTo(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The variables are changed in place (mutated):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(frac)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at a plot." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "grid, pts = total.evaluatePdf(x)\n", "xvals = grid.to_numpy().flatten()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=(6, 4))\n", "ax.hist(total_rand, bins=50, range=(-10, 10), density=True)\n", "ax.plot(xvals, pts, linewidth=2)\n", "# ax.set_xlabel('xvar')\n", "ax.set_ylabel(\"Normalized probability\")\n", "ax.set_ylim(bottom=0)\n", "plt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "compclass", "language": "python", "name": "compclass" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.2" } }, "nbformat": 4, "nbformat_minor": 4 }