{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Algorithms: low-level interface\n", "Once we have data for GST, there are several *algorithms* we can run on it to produce tomographic estimates. Depending on the amount of data you have, and time available for running LinearOperator Set Tomography, one algorithm may be preferable over the others. **What is typically thought of as \"standard GST\" is the iterative maximum-likelihood optimization implemented by `run_iterative_gst`** which optimizes one or more objective functions iteratively, meaning it daisy-chains multiple optimizations, changing (typically *adding* to) the circuits under consideration or the objective function itself (e.g. from $\\chi^2$ to the log-likelihood).\n", "\n", "`pygsti` contains two main low-level algorithms that form the base of the GST protocol(s):\n", "\n", "* **Linear gate set tomography (LGST)**: Uses short operation sequences to quickly compute a rough (low accuracy) estimate of a model by linear inversion. The model estimate produced has a very specific form: it holds the dense process matrices of each \"gate\" (or circuit layer) under consideration.\n", "\n", "* **Long-sequence gate set tomography (LSGST)**: Minimizes an objective function that quantifies a badness-of-fit between a model and data set. Here \"sequence\" is synonymous with \"circuit\". The optimization is performed in stages, using successively larger sets of longer and longer circuits.\n", "\n", "If you're curious, the implementation of these algorithms are found in the `pygsti.algorithms.core` module. In this tutorial, we'll show how to invoke each of these low-level algorithms directly. This is probably only useful when building your own protocol that needs to borrow pieces of GST without borrowing the entire protocol.\n", "\n", "Additionally, `pygsti` contains **gauge-optimization** algorithms. Because the outcome data (the input to the GST algorithms above) only determines a model up to some number of un-physical \"gauge\" degrees of freedom, it is often desirable to optimize the `Model` estimate obtained from a GST algorithm within the space of its gauge freedoms. This process is called \"gauge-optimization\" and the final part of this tutorial demonstrates how to gauge-optimize a model using various criteria." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Low-level GST Algorithms\n", "The ingredients needed by the LGST and LSGST algorithms are:\n", "- a \"target\" `Model` which defines the desired gates. This model is used by LGST to specify the various gate, state preparation, POVM effect, and SPAM labels, as well as to provide an initial guess for the *gauge* degrees of freedom.\n", "- a `DataSet` containing the data that GST attempts to fit using the probabilities generated by a single `Model`. This data set must at least contain the data for the circuits required by the algorithm that is chosen.\n", "- for LSGST, a list-of-lists of `Circuit` objects that specify which circuits are used during each iteration of the algorithm (the length of the top-level list defines the number of interations)." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pygsti\n", "import json" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Reading ../../tutorial_files/Example_Dataset_LowCnts.txt: 100%\n", "Writing cache file (to speed future loads): ../../tutorial_files/Example_Dataset_LowCnts.txt.cache\n", "Loaded target model with operation labels: odict_keys([Label(()), Label(('Gxpi2', 0)), Label(('Gypi2', 0))])\n", "Loaded fiducial lists of lengths: (6, 6)\n", "Loaded dataset of length: 1120\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\ciostro\\Documents\\pyGSTi_random_bugfixes\\pygsti\\tools\\legacytools.py:38: UserWarning: The function save is deprecated, and may not be present in future versions of pygsti.\n", " Please use write_binary instead.\n", " _warnings.warn(message)\n" ] } ], "source": [ "#We'll use the standard I, X(pi/2), Y(pi/2) model that we generated data for in the DataSet tutorial\n", "from pygsti.modelpacks import smq1Q_XYI\n", "\n", "target_model = smq1Q_XYI.target_model()\n", "prep_fiducials = smq1Q_XYI.prep_fiducials()\n", "meas_fiducials = smq1Q_XYI.meas_fiducials()\n", "germs = smq1Q_XYI.germs()\n", "\n", "maxLengthList = [1,2,4,8,16] #for use in iterative algorithms\n", "\n", "ds = pygsti.io.load_dataset(\"../../tutorial_files/Example_Dataset.txt\", cache=True)\n", "dsLowCounts = pygsti.io.load_dataset(\"../../tutorial_files/Example_Dataset_LowCnts.txt\", cache=True)\n", "\n", "depol_model = target_model.depolarize(op_noise=0.1)\n", "\n", "print(\"Loaded target model with operation labels: \", target_model.operations.keys())\n", "print(\"Loaded fiducial lists of lengths: \", (len(prep_fiducials),len(meas_fiducials)))\n", "print(\"Loaded dataset of length: \", len(ds))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using LGST to get an initial estimate\n", "\n", "An important and distinguising property of the LGST algorithm is that it does *not* require an initial-guess `Model` as an input. It uses linear inversion and short sequences to obtain a rough model estimate. As such, it is very common to use the LGST estimate as the initial-guess starting point for more advanced forms of GST." 
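] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make this concrete, here is a small optional sketch (it re-runs LGST quietly on the objects loaded above and is not needed by the cells that follow). Because the raw LGST output sits at an arbitrary gauge, its distance to the target is only meaningful after gauge optimization:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sketch: LGST needs no initial-guess model -- only data, fiducials, and a\n", "# target model (which supplies the labels and fixes the gauge freedoms).\n", "mdl_raw = pygsti.run_lgst(ds, prep_fiducials, meas_fiducials,\n", "                          target_model=target_model, verbosity=0)\n", "\n", "# Compare to the target only after gauge-optimizing; the raw gauge is arbitrary.\n", "mdl_gauged = pygsti.gaugeopt_to_target(mdl_raw, target_model, verbosity=0)\n", "print(\"Frobenius distance to target, raw vs. gauge-optimized:\")\n", "print(mdl_raw.frobeniusdist(target_model), mdl_gauged.frobeniusdist(target_model))"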
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--- LGST ---\n", "Gauge optimization completed in 0.0500555s.\n", "--- Contract to CPTP (direct) ---\n", "The closest legal point found was distance: 0.0002851261395880569\n" ] } ], "source": [ "#Run LGST to get an initial estimate for the gates in target_model based on the data in ds\n", "\n", "#run LGST\n", "mdl_lgst = pygsti.run_lgst(ds, prep_fiducials, meas_fiducials, target_model=target_model, verbosity=1)\n", "\n", "#Gauge optimize the result to match the target model\n", "mdl_lgst_after_gauge_opt = pygsti.gaugeopt_to_target(mdl_lgst, target_model, verbosity=1)\n", "\n", "#Contract the result to CPTP, guaranteeing that the gates are CPTP\n", "mdl_clgst = pygsti.contract(mdl_lgst_after_gauge_opt, \"CPTP\", verbosity=1)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "rho0 = FullState with dimension 4\n", " 0.71-0.02 0.03 0.75\n", "\n", "\n", "Mdefault = UnconstrainedPOVM with effect vectors:\n", "0: FullPOVMEffect with dimension 4\n", " 0.73 0 0 0.65\n", "\n", "1: FullPOVMEffect with dimension 4\n", " 0.69 0 0-0.65\n", "\n", "\n", "\n", "[] = \n", "FullArbitraryOp with shape (4, 4)\n", " 1.00 0 0 0\n", " 0.02 0.88 0.03-0.05\n", "-0.03-0.03 0.88 0.02\n", " 0 0 0 0.90\n", "\n", "\n", "Gxpi2:0 = \n", "FullArbitraryOp with shape (4, 4)\n", " 1.00 0 0 0\n", " 0 0.90 0.03 0\n", "-0.02-0.03-0.03-0.99\n", "-0.06 0.03 0.81 0\n", "\n", "\n", "Gypi2:0 = \n", "FullArbitraryOp with shape (4, 4)\n", " 1.00 0 0 0\n", " 0 0-0.05 1.00\n", " 0.02 0 0.89 0\n", "-0.05-0.80 0 0\n", "\n", "\n", "\n", "\n" ] } ], "source": [ "print(mdl_lgst)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Long-sequence GST\n", "Long-sequence GST operates by repeatedly fitting a model to a data set and using the result to seed a similar optimization that compares the data of more circuits. As such a list-of-lists of circuits, one per iteration, is required. The elements of these lists are typically repetitions of short \"germ\" circuits such that the final circuit does not exceed some maximum length (depth). We use the `make_lsgst_lists` function to create such lists below." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "#Get lists of operation sequences for successive iterations of LSGST to use\n", "lsgstListOfLists = pygsti.circuits.create_lsgst_circuit_lists(target_model, prep_fiducials, meas_fiducials, germs, maxLengthList)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A single iteration, whereby the parameters of a `Model` are adjusted to find the best fit to a `DataSet`, is performed by the `run_gst_fit` function. The \"best fit\" is defined as the model parameters which minimize some objective function. `run_gst_fit` takes as arguments an a `ModelDatasetCircuitStore`, which simply bundles together a `Model`, `Dataset`, and set of `Circuits`, as well as an objective function builder (an object that encapsulates an objective function but lacks some needed context, and can *build* an objective function given this context). For convience, the `run_gst_fit_simple` function abstracts away the `ModelDatasetCircuitStore` construction.\n", "\n", "For example, the cell below finds the model parameters which minimize the log-likelihood computed over the first list of circuits. 
" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.\n", " 1 atoms, parameter block size limits (None,)\n", "*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***\n", " More atom-processors than hosts: each host gets ~1 atom-processors\n", " Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.\n", "*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***\n", "--- dlogl GST ---\n", "2*Delta(log(L)) = 48.8152 (92 data params - 60 (approx) model params = expected mean of 32; p-value = 0.0288947)\n", "Completed in 0.2s\n" ] } ], "source": [ "opt_result, mdl_single_optimization = \\\n", " pygsti.algorithms.run_gst_fit_simple(ds, mdl_clgst, lsgstListOfLists[0],\n", " optimizer=None, objective_function_builder=\"logl\",\n", " resource_alloc=None, verbosity=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The iterative algorithm, which essentially runs `run_gst_fit` multiple times, is implemented by `run_iterative_gst`. The number of iterations is set by the number of circuit lists provided as the 3rd argument. It takes two lists of objective function builders: the first, `iteration_objfn_builders` gives the objective functions to (sequentially) optimize on each iteration, while the second, `final_objfn_builders` gives additional objective functions to optimize on the final iteration." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--- Iterative GST: Iter 1 of 5 92 circuits ---: \n", " MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.\n", " 1 atoms, parameter block size limits (None,)\n", " *** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***\n", " More atom-processors than hosts: each host gets ~1 atom-processors\n", " Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.\n", " *** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***\n", " --- chi2 GST ---\n", " Sum of Chi^2 = 48.6364 (92 data params - 60 (approx) model params = expected mean of 32; p-value = 0.0300305)\n", " Completed in 0.1s\n", " Iteration 1 took 0.1s\n", " \n", "--- Iterative GST: Iter 2 of 5 168 circuits ---: \n", " MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.\n", " 1 atoms, parameter block size limits (None,)\n", " *** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***\n", " More atom-processors than hosts: each host gets ~1 atom-processors\n", " Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.\n", " *** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***\n", " --- chi2 GST ---\n", " Sum of Chi^2 = 123.32 (168 data params - 60 (approx) model params = expected mean of 108; p-value = 0.14878)\n", " Completed in 0.2s\n", " Iteration 2 took 0.3s\n", " \n", "--- Iterative GST: Iter 3 of 5 285 circuits ---: \n", " MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.\n", " 1 atoms, parameter block size limits (None,)\n", " *** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***\n", " More atom-processors than hosts: each host gets ~1 atom-processors\n", " 
Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.\n", " *** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***\n", " --- chi2 GST ---\n", " Sum of Chi^2 = 232.735 (285 data params - 60 (approx) model params = expected mean of 225; p-value = 0.347582)\n", " Completed in 0.2s\n", " Iteration 3 took 0.3s\n", " \n", "--- Iterative GST: Iter 4 of 5 448 circuits ---: \n", " MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.\n", " 1 atoms, parameter block size limits (None,)\n", " *** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***\n", " More atom-processors than hosts: each host gets ~1 atom-processors\n", " Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.\n", " *** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***\n", " --- chi2 GST ---\n", " Sum of Chi^2 = 423.121 (448 data params - 60 (approx) model params = expected mean of 388; p-value = 0.105949)\n", " Completed in 0.4s\n", " Iteration 4 took 0.5s\n", " \n", "--- Iterative GST: Iter 5 of 5 616 circuits ---: \n", " MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.\n", " 1 atoms, parameter block size limits (None,)\n", " *** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***\n", " More atom-processors than hosts: each host gets ~1 atom-processors\n", " Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.\n", " *** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***\n", " --- chi2 GST ---\n", " Sum of Chi^2 = 583.022 (616 data params - 60 (approx) model params = expected mean of 556; p-value = 0.20683)\n", " Completed in 0.5s\n", " Iteration 5 took 0.7s\n", " \n", " Last iteration:\n", " --- dlogl GST ---\n", " 2*Delta(log(L)) = 583.536 (616 data params - 60 (approx) model params = expected mean of 556; p-value = 0.202591)\n", " Completed in 0.7s\n", " Final optimization took 0.7s\n", " \n", "Iterative GST Total Time: 2.7s\n" ] } ], "source": [ "models, opt_results, cache = \\\n", " pygsti.algorithms.run_iterative_gst(ds, mdl_clgst, lsgstListOfLists,\n", " optimizer={'tol': 1e-5}, resource_alloc=None, verbosity=2,\n", " iteration_objfn_builders=['chi2'],\n", " final_objfn_builders=['logl'])\n", "mdl_lsgst = models[-1] # the model from the final iteration, i.e. the \"final GST model\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gauge Optimization\n", "All gauge optimization algorithms perform essentially the same task: to find the model which optimizes some objective function from within the space of models that are gauge-equivalent to some starting model. This is accomplished in `pygsti` using the following mechanism:\n", "- one begins with an initial `ExplicitOpModel`, call it `mdl`, to be gauge-optimized.\n", "- a `pygsti.models.gaugegroup.GaugeGroup` instance defines a parameterized group of allowable gauge transformations. This gauge group must be compatible with `mdl`'s parameterization, so that `mdl.transform_inplace` (which calls `LinearOperator.transform_inplace` and `SPAMVec.transform_inplace`) is able to process elements of the `GaugeGroup` (obtained via a call to `GaugeGroup.element(params)`). That is, the gauge transformation must map between models with the *same* parameterization (that given by `mdl`). 
Because of the close interplay between a model's parameterization and its allowed gauge transformations, `Model` objects can contain a `GaugeGroup` instance as their `default_gauge_group` member. In many circumstances, `mdl.default_gauge_group` is set to the correct gauge group to use for a given `Model`.\n", "- `pygsti.gaugeopt_custom(...)` takes an initial `ExplicitOpModel`, an objective function, and a `GaugeGroup` (along with other optimization parameters) and returns a gauge-optimized `ExplicitOpModel`. Note that if its `gauge_group` argument is left as `None`, then the model's default gauge group is used. An objective function which takes a single `ExplicitOpModel` argument and returns a float can be supplied, giving the user a fair amount of flexibility.\n", "- since usually the objective function is one which compares the model being optimized to a fixed \"target\" model, `pygsti.gaugeopt_to_target(...)` is a routine able to perform these common types of gauge optimization. Instead of an objective function, `gaugeopt_to_target` takes a target `ExplicitOpModel` and additional arguments (see below) from which it constructs an objective function and then calls `gaugeopt_custom`. It is essentially a convenience routine for constructing common gauge optimization objective functions. Relevant arguments which affect what objective function is used are:\n", " - `target_model` : the `ExplicitOpModel` to compare against - i.e., the one you want to gauge optimize toward. **Note that this doesn't have to be a set of ideal gates** - it can be any (imperfect) model that reflects your expectations about what the estimates should look like.\n", " - `item_weights` : a dictionary of weights allowing different gates and/or SPAM operations to be weighted differently when computing the objective function's value.\n", " - `cptp_penalty_factor` : a prefactor multiplying penalty terms that penalize CPTP violations, e.g. the negative Choi-matrix eigenvalues of each gate and the deviation of each operation matrix's first row from `[1 0 ... 0]`.\n", " - `spam_penalty_factor` : a prefactor multiplying penalty terms enforcing the non-negativity of state preparation eigenvalues and that POVM effect eigenvalues lie between 0 and 1.\n", " - `gates_metric` : how to compare corresponding gates in the gauge-optimized and target sets. `\"frobenius\"` uses the Frobenius norm (weighted before taking the final square root), `\"fidelity\"` uses the *squared process infidelity* (squared to avoid negative-infidelity issues in non-TP models), and `\"tracedist\"` uses the trace distance (weighted after computing the trace distance between corresponding gates).\n", " - `spam_metric` : how to compare corresponding SPAM vectors. `\"frobenius\"` (the default) should be used here, as `\"fidelity\"` and `\"tracedist\"` compare the \"SPAM gates\" -- the outer product of state prep and POVM effect vectors -- which isn't a meaningful metric.\n", " \n", "First, a short sketch below illustrates the more general `gaugeopt_custom` route described above; the cell after it then demonstrates some common usages of `gaugeopt_to_target`." 
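] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is a minimal sketch of `gaugeopt_custom` with a user-supplied objective, following the Model-in, float-out convention stated above. The particular objective -- the Frobenius distance of just the `('Gxpi2', 0)` gate to its target -- is an illustrative assumption, as is indexing the model's `operations` dictionary with the tuple label printed earlier in this tutorial:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "def my_gauge_objective(model):\n", "    # Any function taking a single model and returning a float works here.\n", "    diff = (model.operations[('Gxpi2', 0)].to_dense()\n", "            - target_model.operations[('Gxpi2', 0)].to_dense())\n", "    return float(np.linalg.norm(diff))\n", "\n", "# gauge_group=None (the default) means mdl_lsgst's default_gauge_group is used.\n", "mdl_custom = pygsti.gaugeopt_custom(mdl_lsgst.copy(), my_gauge_objective, verbosity=1)\n", "print(\"Gxpi2 distance to target after custom gauge opt:\", my_gauge_objective(mdl_custom))"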
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Default gauge group = \n", "\n", "gaugeopt_to_target output:\n", " --- Gauge Optimization (ls method) ---\n", "--- Outer Iter 0: norm_f = 0.0922348, mu=1, |x|=2, |J|=2.69209\n", "--- Outer Iter 1: norm_f = 0.0922334, mu=0.00171089, |x|=2.00005, |J|=2.69205\n", "--- Outer Iter 2: norm_f = 0.0922334, mu=0.000513266, |x|=2.00004, |J|=2.69205\n", " Least squares message = Both actual and predicted relative reductions in the sum of squares are at most 1e-08\n", "Gauge optimization completed in 0.0521772s.\n", "Final frobenius distance between mdl_go7 and target_model = 0.03920744104767387\n" ] } ], "source": [ "mdl = mdl_lsgst.copy() #we'll use the MLGST result from above as an example\n", "mdl_go1 = pygsti.gaugeopt_to_target(mdl, target_model) # optimization to the perfect target gates\n", "mdl_go2 = pygsti.gaugeopt_to_target(mdl, depol_model) # optimization to a \"guess\" at what the estimate should be\n", "mdl_go3 = pygsti.gaugeopt_to_target(mdl, target_model, {'gates': 1.0, 'spam': 0.01}) \n", " # weight the gates differently from the SPAM operations\n", "mdl_go4 = pygsti.gaugeopt_to_target(mdl, target_model, {'gates': 1.0, 'spam': 0.01, 'Gx': 10.0, 'E0': 0.001}) \n", " # weight an individual gate/SPAM separately (note the specific gate/SPAM labels always override\n", " # the more general 'gates' and 'spam' weight values). \n", "mdl_go5 = pygsti.gaugeopt_to_target(mdl, target_model, gates_metric=\"tracedist\") #use trace distance instead of frobenius\n", "\n", "from pygsti.models.gaugegroup import UnitaryGaugeGroup\n", "print(\"Default gauge group = \",type(mdl.default_gauge_group)) # default is FullGaugeGroup\n", "mdl_go6 = pygsti.gaugeopt_to_target(mdl, target_model, gauge_group=UnitaryGaugeGroup(1, 'pp'))\n", " #gauge optimize only over unitary gauge transformations\n", "\n", "print(\"\\ngaugeopt_to_target output:\")\n", "mdl_go7 = pygsti.gaugeopt_to_target(mdl, target_model, verbosity=3) # show output\n", "print(\"Final frobenius distance between mdl_go7 and target_model = \", mdl_go7.frobeniusdist(target_model))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 4 }