{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Algorithms: low-level interface\n", "Once we have data for GST, there are several *algorithms* we can run on it to produce tomographic estimates. Depending on the amount of data you have and the time available for running Gate Set Tomography (GST), one algorithm may be preferable over the others. **What is typically thought of as \"standard GST\" is the iterative maximum-likelihood optimization implemented by `do_iterative_mlgst`**, which uses a combination of minimum-$\\chi^2$ GST and maximum-likelihood GST.\n", "\n", "`pygsti` provides support for the following \"primary\" GST algorithms:\n", "\n", "* **Linear Gate Set Tomography (LGST)**: Uses short operation sequences to quickly compute a rough (low-accuracy) estimate of a model by linear inversion.\n", "\n", "* **Extended Linear Gate Set Tomography (eLGST or EXLGST)**: Minimizes the sum-of-squared errors between independent LGST estimates and the estimates obtained from a single model to find a best-estimate model. This is typically done in an iterative fashion, using LGST estimates for longer and longer sequences. \n", "\n", "* **Minimum-$\\chi^2$ Gate Set Tomography (MC2GST)**: Minimizes the $\\chi^{2}$ statistic of the data frequencies and model probabilities to find a best-estimate model. Typically done in an iterative fashion, using successively larger sets of longer and longer operation sequences. \n", "\n", "* **Maximum-Likelihood Gate Set Tomography (MLGST)**: Maximizes the log-likelihood statistic of the data frequencies and model probabilities to find a best-estimate model. Typically done in an iterative fashion similar to MC2GST. This maximum likelihood estimation (MLE) is very well-motivated from a statistics standpoint and should yield the **most accurate estimates among these algorithms**. \n", "\n", "If you're curious, the implementations of the LGST, EXLGST, MC2GST, and MLGST algorithms may be found in the `pygsti.algorithms.core` module. In this tutorial, we'll show how to invoke each of these algorithms.\n", "\n", "Additionally, `pygsti` contains **gauge-optimization** algorithms. Because the outcome data (the input to the GST algorithms above) only determines a model up to some number of un-physical \"gauge\" degrees of freedom, it is often desirable to optimize the `Model` estimate obtained from a GST algorithm within the space of its gauge freedoms. This process is called \"gauge-optimization\", and the final part of this tutorial demonstrates how to gauge-optimize a model using various criteria." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Primary GST Algorithms\n", "The ingredients needed as input to the \"primary\" GST algorithms are:\n", "- a \"target\" `Model` which defines the desired gates. This model is used by LGST to specify the various gate, state preparation, POVM effect, and SPAM labels, as well as to provide an initial guess for the *gauge* degrees of freedom.\n", "- a `DataSet` containing the data that GST attempts to fit using the probabilities generated by a single `Model`. This data set must at least contain the data for the operation sequences required by the chosen algorithm.\n", "- for EXLGST, MC2GST, and MLGST, a list-of-lists of `Circuit` objects, which specify which operation sequences are used during each iteration of the algorithm (the length of the top-level list defines the number of iterations); a minimal sketch of this structure is given below. Note that which operation sequences are included in these lists is different for EXLGST than it is for MC2GST and MLGST." ] },
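{ "cell_type": "markdown", "metadata": {}, "source": [ "Before loading real data, here is a minimal sketch of that list-of-lists structure. The gate labels `Gx` and `Gy` are hypothetical placeholders used only for illustration; the actual lists used later in this tutorial are built by `pygsti.construction.make_elgst_lists` and `make_lsgst_lists`." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pygsti.objects import Circuit\n", "\n", "#Minimal sketch (hypothetical 'Gx'/'Gy' labels): each inner list holds the\n", "#circuits used at one iteration, and later iterations typically extend the\n", "#earlier ones with longer sequences.\n", "shortCircuits = [Circuit(('Gx',)), Circuit(('Gy',))]\n", "longerCircuits = shortCircuits + [Circuit(('Gx',)*4), Circuit(('Gy',)*4)]\n", "\n", "exampleListOfLists = [shortCircuits, longerCircuits] #2 iterations\n", "print(\"Number of iterations:\", len(exampleListOfLists))" ] },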
{ "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pygsti\n", "import json" ] },
{ "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "#We'll use the standard I, X(pi/2), Y(pi/2) model that we generated data for in the DataSet tutorial\n", "from pygsti.modelpacks import smq1Q_XYI\n", "\n", "target_model = smq1Q_XYI.target_model()\n", "prep_fiducials = smq1Q_XYI.prep_fiducials()\n", "meas_fiducials = smq1Q_XYI.meas_fiducials()\n", "germs = smq1Q_XYI.germs()\n", "\n", "maxLengthList = [1,2,4,8,16] #for use in iterative algorithms\n", "\n", "ds = pygsti.io.load_dataset(\"../../tutorial_files/Example_Dataset.txt\", cache=True)\n", "dsLowCounts = pygsti.io.load_dataset(\"../../tutorial_files/Example_Dataset_LowCnts.txt\", cache=True)\n", "\n", "depol_model = target_model.depolarize(op_noise=0.1)\n", "\n", "print(\"Loaded target model with operation labels: \", target_model.operations.keys())\n", "print(\"Loaded fiducial lists of lengths: \", (len(prep_fiducials),len(meas_fiducials)))\n", "print(\"Loaded dataset of length: \", len(ds))" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Using LGST to get an initial estimate\n", "\n", "An important and distinguishing property of the LGST algorithm is that it does *not* require an initial-guess `Model` as an input. It uses linear inversion and short sequences to obtain a rough model estimate. As such, it is very common to use the LGST estimate as the initial-guess starting point for more advanced forms of GST." ] },
{ "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "#Run LGST to get an initial estimate for the gates in target_model based on the data in ds\n", "\n", "#run LGST\n", "mdl_lgst = pygsti.do_lgst(ds, prep_fiducials, meas_fiducials, targetModel=target_model, verbosity=1)\n", "\n", "#Gauge optimize the result to match the target model\n", "mdl_lgst_after_gauge_opt = pygsti.gaugeopt_to_target(mdl_lgst, target_model, verbosity=1)\n", "\n", "#Contract the result to CPTP, guaranteeing that the gates are CPTP\n", "mdl_clgst = pygsti.contract(mdl_lgst_after_gauge_opt, \"CPTP\", verbosity=1)" ] },
{ "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "print(mdl_lgst)" ] },
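{ "cell_type": "markdown", "metadata": {}, "source": [ "As a quick sanity check, the sketch below compares each stage of the LGST post-processing to the ideal target using the `frobeniusdist` method that also appears later in this tutorial. The LGST estimate is only fixed up to gauge, so gauge-optimizing toward the target typically reduces this distance substantially." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Sketch: frobenius distance to the target at each stage of LGST post-processing\n", "print(\"Raw LGST vs. target:        \", mdl_lgst.frobeniusdist(target_model))\n", "print(\"After gauge opt vs. target: \", mdl_lgst_after_gauge_opt.frobeniusdist(target_model))\n", "print(\"After CPTP contraction:     \", mdl_clgst.frobeniusdist(target_model))" ] },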
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Extended LGST (eLGST or EXLGST)\n", "EXLGST requires a list-of-lists of operation sequences, one per iteration. The elements of these lists are typically repetitions of short \"germ\" strings such that the final string does not exceed some maximum length. We created such lists in the operation sequence tutorial; here we construct them directly using `pygsti.construction.make_elgst_lists`." ] },
{ "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": false }, "outputs": [], "source": [ "#Get lists of operation sequences for successive iterations of eLGST to use\n", "elgstListOfLists = pygsti.construction.make_elgst_lists(target_model, germs, maxLengthList)\n", " \n", "#run EXLGST. The result, mdl_exlgst, is a Model containing the estimated quantities\n", "mdl_exlgst = pygsti.do_iterative_exlgst(ds, mdl_clgst, prep_fiducials, meas_fiducials, elgstListOfLists,\n", "                                        targetModel=target_model, verbosity=2)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Minimum-$\\chi^2$ GST (MC2GST)\n", "MC2GST and MLGST also require a list-of-lists of operation sequences, one per iteration. However, the elements of these lists are typically repetitions of short \"germ\" strings *sandwiched between fiducial strings* such that the repeated-germ part of the string does not exceed some maximum length. We created such lists in the operation sequence tutorial; here we construct them directly using `pygsti.construction.make_lsgst_lists`." ] },
{ "cell_type": "code", "execution_count": 6, "metadata": { "scrolled": true }, "outputs": [], "source": [ "#Get lists of operation sequences for successive iterations of LSGST to use\n", "lsgstListOfLists = pygsti.construction.make_lsgst_lists(target_model, prep_fiducials, meas_fiducials, germs, maxLengthList)\n", " \n", "#run MC2GST. The result is a Model containing the estimated quantities\n", "mdl_mc2 = pygsti.do_iterative_mc2gst(ds, mdl_clgst, lsgstListOfLists, verbosity=2)" ] },
{ "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "#Write the resulting EXLGST and MC2GST results to model text files for later reference.\n", "pygsti.io.write_model(mdl_exlgst, \"../../tutorial_files/Example_eLGST_Gateset.txt\",\"# Example result from running eLGST\")\n", "pygsti.io.write_model(mdl_mc2, \"../../tutorial_files/Example_MC2GST_Gateset.txt\",\"# Example result from running MC2GST\")" ] },
{ "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "#Run MC2GST again but use a DataSet with a lower number of counts \n", "mdl_mc2_lowcnts = pygsti.do_iterative_mc2gst(dsLowCounts, mdl_clgst, lsgstListOfLists, verbosity=2)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Maximum Likelihood GST (MLGST)\n", "Executing MLGST is very similar to MC2GST: the same operation sequence lists can be used and the calling syntax is nearly identical." ] },
{ "cell_type": "code", "execution_count": 9, "metadata": { "scrolled": true }, "outputs": [], "source": [ "#run MLGST. The result is a Model containing the estimated quantities\n", "mdl_mle = pygsti.do_iterative_mlgst(ds, mdl_clgst, lsgstListOfLists, verbosity=2)" ] },
{ "cell_type": "code", "execution_count": 10, "metadata": { "scrolled": true }, "outputs": [], "source": [ "#Run MLGST again but use a DataSet with a lower number of counts \n", "mdl_mle_lowcnts = pygsti.do_iterative_mlgst(dsLowCounts, mdl_clgst, lsgstListOfLists, verbosity=2)" ] },
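{ "cell_type": "markdown", "metadata": {}, "source": [ "As an aside, it can be instructive to peek at the iterative structure these algorithms consume. The sketch below simply inspects `lsgstListOfLists` (built above); the exact circuit counts depend on the fiducials, germs, and maximum lengths loaded earlier." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Sketch: the top-level length sets the number of iterations, and the inner\n", "#lists grow as the maximum sequence length increases.\n", "print(\"Number of iterations:\", len(lsgstListOfLists))\n", "for maxLen, circuitList in zip(maxLengthList, lsgstListOfLists):\n", "    print(\"Max length %3d: %d circuits\" % (maxLen, len(circuitList)))" ] },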
{ "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Gauge Optimization\n", "All gauge optimization algorithms perform essentially the same task: find the model which optimizes some objective function from within the set of models that are gauge-equivalent to some starting model. This is accomplished in `pygsti` using the following mechanism:\n", "- one begins with an initial `ExplicitOpModel`, call it `mdl`, to be gauge-optimized.\n", "- a `pygsti.objects.GaugeGroup` instance defines a parameterized group of allowable gauge transformations. This gauge group must be compatible with `mdl`'s parameterization, so that `mdl.transform` (which calls `LinearOperator.transform` and `SPAMVec.transform`) is able to process elements of the `GaugeGroup` (obtained via a call to `GaugeGroup.get_element(params)`). That is, the gauge transformation must map between models with the *same* parameterization (that of `mdl`). Because of the close interplay between a model's parameterization and its allowed gauge transformations, `Model` objects can contain a `GaugeGroup` instance as their `default_gauge_group` member. In many circumstances, `mdl.default_gauge_group` is already set to the correct gauge group to use for a given `Model`.\n", "- `pygsti.gaugeopt_custom(...)` takes an initial `ExplicitOpModel`, an objective function, and a `GaugeGroup` (along with other optimization parameters) and returns a gauge-optimized `ExplicitOpModel`. Note that if its `gauge_group` argument is left as `None`, then the model's default gauge group is used. An objective function which takes a single `ExplicitOpModel` argument and returns a float can be supplied, giving the user a fair amount of flexibility (a sketch of such a custom-objective call appears after the demonstration cell below).\n", "- since the objective function usually compares the model being optimized to a fixed \"target\" model, `pygsti.gaugeopt_to_target(...)` is a routine able to perform these common types of gauge optimization. Instead of an objective function, `gaugeopt_to_target` takes a target `ExplicitOpModel` and additional arguments (see below) from which it constructs an objective function and then calls `gaugeopt_custom`. It is essentially a convenience routine for constructing common gauge optimization objective functions. Relevant arguments which affect what objective function is used are:\n", " - `targetModel` : the `ExplicitOpModel` to compare against - i.e., the one you want to gauge optimize toward. **Note that this doesn't have to be a set of ideal gates** - it can be any (imperfect) model that reflects your expectations about what the estimates should look like.\n", " - `itemWeights` : a dictionary of weights allowing different gates and/or SPAM operations to be weighted differently when computing the objective function's value.\n", " - `CPpenalty` : a prefactor multiplying the sum of all the negative Choi-matrix eigenvalues corresponding to each of the gates.\n", " - `TPpenalty` : a prefactor multiplying the sum of absolute-value differences between the first row of each operation matrix and `[1 0 ... 0 ]` and the discrepancy between the first element of each state preparation vector and its expected value.\n", " - `validSpamPenalty` : a prefactor multiplying penalty terms enforcing the non-negativity of state preparation eigenvalues and that POVM effect eigenvalues lie between 0 and 1.\n", " - `gatesMetric` : how to compare corresponding gates in the gauge-optimized and target sets. `\"frobenius\"` uses the frobenius norm (weighted before taking the final sqrt), `\"fidelity\"` uses the *squared process infidelity* (squared to avoid negative-infidelity issues in non-TP models), and `\"tracedist\"` uses the trace distance (weighted after computing the trace distance between corresponding gates).\n", " - `spamMetric` : how to compare corresponding SPAM vectors. `\"frobenius\"` (the default) should be used here, as `\"fidelity\"` and `\"tracedist\"` compare the \"SPAM gates\" -- the outer product of state prep and POVM effect vectors -- which isn't a meaningful metric.\n", " \n", "The cell below demonstrates some common usages of `gaugeopt_to_target`." ] },
{ "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "mdl = mdl_mle.copy() #we'll use the MLGST result from above as an example\n", "mdl_go1 = pygsti.gaugeopt_to_target(mdl, target_model) # optimization to the perfect target gates\n", "mdl_go2 = pygsti.gaugeopt_to_target(mdl, depol_model) # optimization to a \"guess\" at what the estimate should be\n", "mdl_go3 = pygsti.gaugeopt_to_target(mdl, target_model, {'gates': 1.0, 'spam': 0.01}) \n", "  # weight the gates differently from the SPAM operations\n", "mdl_go4 = pygsti.gaugeopt_to_target(mdl, target_model, {'gates': 1.0, 'spam': 0.01, 'Gx': 10.0, 'E0': 0.001}) \n", "  # weight an individual gate/SPAM separately (note the specific gate/SPAM labels always override\n", "  # the more general 'gates' and 'spam' weight values). \n", "mdl_go5 = pygsti.gaugeopt_to_target(mdl, target_model, gatesMetric=\"tracedist\") #use trace distance instead of frobenius\n", "\n", "print(\"Default gauge group = \",type(mdl.default_gauge_group)) # default is FullGaugeGroup\n", "mdl_go6 = pygsti.gaugeopt_to_target(mdl, target_model, gauge_group=pygsti.objects.UnitaryGaugeGroup(mdl.dim, 'pp'))\n", "  #gauge optimize only over unitary gauge transformations\n", "\n", "print(\"\\ngaugeopt_to_target output:\")\n", "mdl_go7 = pygsti.gaugeopt_to_target(mdl, target_model, verbosity=3) # show output\n", "print(\"Final frobenius distance between mdl_go7 and target_model = \", mdl_go7.frobeniusdist(target_model))" ] },
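{ "cell_type": "markdown", "metadata": {}, "source": [ "For completeness, here is a minimal sketch of the custom-objective usage of `gaugeopt_custom` described above, assuming (as stated there) an objective function that takes a single model and returns a float. The objective chosen here -- the frobenius distance to `depol_model` -- is purely illustrative, and `my_objective` / `mdl_custom` are hypothetical names." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Sketch: gauge-optimize with a user-supplied objective function. With this\n", "#objective the result should resemble gaugeopt_to_target(mdl, depol_model).\n", "def my_objective(model):\n", "    return model.frobeniusdist(depol_model) #float to be minimized\n", "\n", "mdl_custom = pygsti.gaugeopt_custom(mdl, my_objective, verbosity=1)\n", "print(\"Frobenius distance to depol_model = \", mdl_custom.frobeniusdist(depol_model))" ] },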
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Compare MLGST with MC2GST (after gauge optimization)\n", "\n", "Both MLGST and MC2GST use a $\\chi^{2}$ optimization procedure for all but the final iteration. For the last set of circuits (the last iteration), MLGST uses a maximum likelihood estimation. Below, we show how close the two estimates are to one another. Before making the comparison, however, we **optimize the gauge** so the estimated gates are as close to the data-generating (depolarized) model as the gauge degrees of freedom allow." ] },
{ "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# Gauge-optimize each estimate toward the data-generating (depolarized) model\n", "mdl_mle = pygsti.gaugeopt_to_target(mdl_mle,depol_model)\n", "mdl_mle_lowcnts = pygsti.gaugeopt_to_target(mdl_mle_lowcnts,depol_model)\n", "mdl_mc2 = pygsti.gaugeopt_to_target(mdl_mc2,depol_model)\n", "mdl_mc2_lowcnts = pygsti.gaugeopt_to_target(mdl_mc2_lowcnts,depol_model)" ] },
{ "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "print(\"Frobenius diff btwn MLGST and datagen = {0}\".format(round(mdl_mle.frobeniusdist(depol_model), 6)))\n", "print(\"Frobenius diff btwn MC2GST and datagen = {0}\".format(round(mdl_mc2.frobeniusdist(depol_model), 6)))\n", "print(\"Frobenius diff btwn MLGST and LGST = {0}\".format(round(mdl_mle.frobeniusdist(mdl_clgst), 6)))\n", "print(\"Frobenius diff btwn MLGST and MC2GST = {0}\".format(round(mdl_mle.frobeniusdist(mdl_mc2), 6)))\n", "print(\"Chi^2 ( MC2GST ) = {0}\".format(round(pygsti.chi2(mdl_mc2, ds, lsgstListOfLists[-1]), 4)))\n", "print(\"Chi^2 ( MLGST ) = {0}\".format(round(pygsti.chi2(mdl_mle, ds, lsgstListOfLists[-1] ), 4)))\n", "print(\"LogL ( MC2GST ) = {0}\".format(round(pygsti.logl(mdl_mc2, ds, lsgstListOfLists[-1]), 4)))\n", "print(\"LogL ( MLGST ) = {0}\".format(round(pygsti.logl(mdl_mle, ds, lsgstListOfLists[-1]), 4)))" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Notice that, as expected, the MC2GST estimate has a *slightly* lower $\\chi^{2}$ score than the MLGST estimate, and the MLGST estimate has a *slightly* higher loglikelihood than the MC2GST estimate. In addition, _both_ are close (in terms of the Frobenius difference) to the depolarized model. This is good: it means GST is giving us estimates which are close to the _true_ model used to generate the data. Performing the same analysis with the low-count data shows larger differences between the two, which is expected since the $\\chi^2$ and loglikelihood statistics become more similar at large $N$, that is, for large numbers of samples." ] },
{ "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "print(\"LOW COUNT DATA:\")\n", "print(\"Frobenius diff btwn MLGST and datagen = {0}\".format(round(mdl_mle_lowcnts.frobeniusdist(depol_model), 6)))\n", "print(\"Frobenius diff btwn MC2GST and datagen = {0}\".format(round(mdl_mc2_lowcnts.frobeniusdist(depol_model), 6)))\n", "print(\"Frobenius diff btwn MLGST and LGST = {0}\".format(round(mdl_mle_lowcnts.frobeniusdist(mdl_clgst), 6)))\n", "print(\"Frobenius diff btwn MLGST and MC2GST = {0}\".format(round(mdl_mle_lowcnts.frobeniusdist(mdl_mc2_lowcnts), 6)))\n", "print(\"Chi^2 ( MC2GST ) = {0}\".format(round(pygsti.chi2(mdl_mc2_lowcnts, dsLowCounts, lsgstListOfLists[-1]), 4)))\n", "print(\"Chi^2 ( MLGST ) = {0}\".format(round(pygsti.chi2(mdl_mle_lowcnts, dsLowCounts, lsgstListOfLists[-1] ), 4)))\n", "print(\"LogL ( MC2GST ) = {0}\".format(round(pygsti.logl(mdl_mc2_lowcnts, dsLowCounts, lsgstListOfLists[-1]), 4)))\n", "print(\"LogL ( MLGST ) = {0}\".format(round(pygsti.logl(mdl_mle_lowcnts, dsLowCounts, lsgstListOfLists[-1]), 4)))" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" } }, "nbformat": 4, "nbformat_minor": 1 }