{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tutorial 1: Basics\n", "\n", "This tutorial will talk about how to use this software from your own python project or Jupyter notebook.\n", "There is also a nice command line interface that enables you to do the same with just two lines in your command line.\n", "\n", "**NOTE FOR CONTRIBUTORS: Always clear all output before commiting (``Cell`` > ``All Output`` > ``Clear``)**!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Magic\n", "%matplotlib inline\n", "# Reload modules whenever they change\n", "%load_ext autoreload\n", "%autoreload 2\n", "\n", "# Make clusterking package available even without installation\n", "import sys\n", "sys.path = [\"../../\"] + sys.path\n", "\n", "import clusterking as ck" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Scanning\n", "\n", "### Setting it up" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's set up a scanner object and configure it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s = ck.scan.WilsonScanner()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First we set up the function/distribution that we want to consider. Here we look into the branching ratio with respect to $q^2$ of $B\\to D \\,\\tau\\, \\bar\\nu_\\tau$. The function of the differential branching ration is taken from the flavio package (https://flav-io.github.io/). The $q^2$ binning is chosen to have 9 bins between $3.2 \\,\\text{GeV}^2$ and $11.6\\,\\text{GeV}^2$ and is implemented as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import flavio\n", "import numpy as np\n", "\n", "def dBrdq2(w, q):\n", " return flavio.np_prediction(\"dBR/dq2(B+->Dtaunu)\", w, q)\n", "\n", "s.set_dfunction(\n", " dBrdq2,\n", " binning=np.linspace(3.2, 11.6, 10),\n", " normalize=True\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, let's set up the Wilson coefficients that need to be sampled. The Wilson coefficients are implemented using the Wilson package (https://wilson-eft.github.io/), which allows to use a variety of bases, EFTs and matches them to user specified scales.\n", "Using the example of $B\\to D \\tau \\bar\\nu_\\tau$, we sample the coefficients ``CVL_bctaunutau``, ``CSL_bctaunutau`` and ``CT_bctaunutau`` from the ``flavio`` basis (https://wcxf.github.io/assets/pdf/WET.flavio.pdf) with 10 points between $-1$ and $1$ at the scale of 5 GeV:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s.set_spoints_equidist(\n", " {\n", " \"CVL_bctaunutau\": (-1, 1, 10),\n", " \"CSL_bctaunutau\": (-1, 1, 10),\n", " \"CT_bctaunutau\": (-1, 1, 10)\n", " },\n", " scale=5,\n", " eft='WET',\n", " basis='flavio'\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Running it" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now to compute the kinematical distributions from the Wilson coefficients sampled above we need a data instance:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d = ck.Data()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Computing the kinematical distributions is done using ``run()`` method:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s.run(d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The results are saved in a dataframe, ``d.df``. Let's have a look:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "d.df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Clustering" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's build a hierarchy cluster out of the data object we created above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c = ck.cluster.HierarchyCluster(d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we have to specify the metric we want to use to measure the distance between different distributions. If no argument is specified, the common $\\chi^2$ metric from is used." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c.set_metric()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's build now the hierarchy cluster:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c.build_hierarchy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The maximal distance between the individual clusters ``max_d`` can be chosen as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "c.cluster(max_d=0.15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we add the information about the clusters to the dataframe created above:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c.write()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take a look and notice the new column ``cluster`` at the end of the data frame:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d.df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Selecting benchmark points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In a similar way we can determine the benchmark points representing the individual clusters. Initializing a benchmark point object" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b = ck.Benchmark(d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and again choosing a metric ($\\chi^2$ metric is default)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b.set_metric()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "the benchmark points can be computed" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b.select_bpoints()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and written in the dataframe:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b.write()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take a look and notice the new column ``bpoint`` at the end of the data frame:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d.df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preserving results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now it's time to write out the results for later use." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d.write(\"output/cluster\", \"tutorial_basics\", overwrite=\"overwrite\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This will not only write out the data itself, but also a lot of associated metadata that makes it easy to later reconstruct what the data actually represents. This was accumulated in the attribute ``d.md`` over all steps:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d.md" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 1 }