{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# Using GST to test for context dependence\n",
    "This example shows how to introduce new operation labels into a GST analysis so as to test for context dependence.  In particular, we'll look at the 1-qubit X, Y, I model.  Suppose a usual GST analysis cannot fit the model well, and that we think this is due to the fact that a \"Gi\" gate which immediately follows a \"Gx\" gate is affected by some residual noise that isn't otherwise present.  In this case, we can model the system as having two different \"Gi\" gates: \"Gi\" and \"Gi2\", and model the \"Gi\" gate as \"Gi2\" whenever it follows a \"Gx\" gate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from __future__ import print_function\n",
    "import pygsti\n",
    "from pygsti.modelpacks.legacy import std1Q_XYI"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First we'll create a mock data set that exhibits this context dependence.  To do this, we add an additional \"Gi2\" gate to the data-generating model, generate some data using \"Gi2\"-containing operation sequences, and finally replace all instances of \"Gi2\" with \"Gi\" so that it looks like data that was supposed to have just a single \"Gi\" gate. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# The usual setup: identify the target model, fiducials, germs, and max-lengths\n",
    "target_model = std1Q_XYI.target_model()\n",
    "fiducials = std1Q_XYI.fiducials\n",
    "germs = std1Q_XYI.germs\n",
    "maxLengths = [1,2,4,8,16,32]\n",
    "\n",
    "# Create a Model to generate the data - one that has two identity gates: Gi and Gi2\n",
    "mdl_datagen = target_model.depolarize(op_noise=0.1, spam_noise=0.001)\n",
    "mdl_datagen[\"Gi2\"] = mdl_datagen[\"Gi\"].copy()\n",
    "mdl_datagen[\"Gi2\"].depolarize(0.1) # depolarize Gi2 even further\n",
    "mdl_datagen[\"Gi2\"].rotate( (0,0,0.1), mdl_datagen.basis) # and rotate it slightly about the Z-axis\n",
    "\n",
    "# Create a set of operation sequences by constructing the usual set of experiments and using \n",
    "# \"manipulate_circuit_list\" to replace Gi with Gi2 whenever it follows Gx.  Create a \n",
    "# DataSet using the resulting Gi2-containing list of sequences.\n",
    "listOfExperiments = pygsti.construction.make_lsgst_experiment_list(target_model, fiducials, fiducials, germs, maxLengths)\n",
    "rules = [ ((\"Gx\",\"Gi\") , (\"Gx\",\"Gi2\")) ] # a single replacement rule: GxGi => GxGi2 \n",
    "listOfExperiments = pygsti.construction.manipulate_circuit_list(listOfExperiments, rules)\n",
    "ds = pygsti.construction.generate_fake_data(mdl_datagen, listOfExperiments, nSamples=10000,\n",
    "                                            sampleError=\"binomial\", seed=1234)\n",
    "\n",
    "# Revert all the Gi2 labels back to Gi, so that the DataSet doesn't contain any Gi2 labels.\n",
    "rev_rules = [ ((\"Gi2\",) , (\"Gi\",)) ] # returns all Gi2's to Gi's \n",
    "ds = ds.copy_nonstatic()\n",
    "ds.process_circuits(lambda opstr: pygsti.construction.manipulate_circuit(opstr,rev_rules))\n",
    "ds.done_adding_data()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Running \"standard\" GST on this `DataSet` resulst in a bad fit: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--- Circuit Creation ---\n",
      "   1702 sequences created\n",
      "   Dataset has 1702 entries: 1702 utilized, 0 requested sequences were missing\n",
      "--- LGST ---\n",
      "  Singular values of I_tilde (truncating to first 4 of 6) = \n",
      "  4.244188936768758\n",
      "  1.1682925525201766\n",
      "  0.9547999005720996\n",
      "  0.9492497151099545\n",
      "  0.021826565768099753\n",
      "  0.0037403199750632556\n",
      "  \n",
      "  Singular values of target I_tilde (truncating to first 4 of 6) = \n",
      "  4.242640687119284\n",
      "  1.4142135623730956\n",
      "  1.4142135623730954\n",
      "  1.4142135623730951\n",
      "  2.5394445830714747e-16\n",
      "  1.118988490269554e-16\n",
      "  \n",
      "--- Iterative MLGST: Iter 1 of 6  92 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 103.835 (92 data params - 31 model params = expected mean of 61; p-value = 0.000518234)\n",
      "  Completed in 0.4s\n",
      "  2*Delta(log(L)) = 103.629\n",
      "  Iteration 1 took 0.4s\n",
      "  \n",
      "--- Iterative MLGST: Iter 2 of 6  168 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 334.957 (168 data params - 31 model params = expected mean of 137; p-value = 0)\n",
      "  Completed in 0.3s\n",
      "  2*Delta(log(L)) = 333.153\n",
      "  Iteration 2 took 0.4s\n",
      "  \n",
      "--- Iterative MLGST: Iter 3 of 6  450 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 1453.91 (450 data params - 31 model params = expected mean of 419; p-value = 0)\n",
      "  Completed in 0.7s\n",
      "  2*Delta(log(L)) = 1443.6\n",
      "  Iteration 3 took 0.8s\n",
      "  \n",
      "--- Iterative MLGST: Iter 4 of 6  862 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 3044.92 (862 data params - 31 model params = expected mean of 831; p-value = 0)\n",
      "  Completed in 1.2s\n",
      "  2*Delta(log(L)) = 3025.62\n",
      "  Iteration 4 took 1.5s\n",
      "  \n",
      "--- Iterative MLGST: Iter 5 of 6  1282 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 4147.68 (1282 data params - 31 model params = expected mean of 1251; p-value = 0)\n",
      "  Completed in 1.7s\n",
      "  2*Delta(log(L)) = 4126.09\n",
      "  Iteration 5 took 2.0s\n",
      "  \n",
      "--- Iterative MLGST: Iter 6 of 6  1702 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 4664.28 (1702 data params - 31 model params = expected mean of 1671; p-value = 0)\n",
      "  Completed in 2.3s\n",
      "  2*Delta(log(L)) = 4642.36\n",
      "  Iteration 6 took 2.7s\n",
      "  \n",
      "  Switching to ML objective (last iteration)\n",
      "  --- MLGST ---\n",
      "    Maximum log(L) = 2320.85 below upper bound of -2.85002e+07\n",
      "      2*Delta(log(L)) = 4641.7 (1702 data params - 31 model params = expected mean of 1671; p-value = 0)\n",
      "    Completed in 1.2s\n",
      "  2*Delta(log(L)) = 4641.7\n",
      "  Final MLGST took 1.2s\n",
      "  \n",
      "Iterative MLGST Total Time: 9.0s\n"
     ]
    }
   ],
   "source": [
    "target_model.set_all_parameterizations(\"TP\")\n",
    "results = pygsti.do_long_sequence_gst(ds, target_model, fiducials, fiducials,\n",
    "                                      germs, maxLengths, verbosity=2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So, since we have a hunch that the reason for the bad fit is that when \"Gi\" follows \"Gx\" it looks different, we can fit that data to a model that has two identity gates, call them \"Gi\" and \"Gi2\" again, and tell GST to perform the \"GxGi => GxGi2\" manipulation rule before computing the probability of a gate sequence:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--- Circuit Creation ---\n",
      "   1738 sequences created\n",
      "   Dataset has 1702 entries: 1738 utilized, 0 requested sequences were missing\n",
      "--- LGST ---\n",
      "  Singular values of I_tilde (truncating to first 4 of 6) = \n",
      "  4.244188936768758\n",
      "  1.1682925525201766\n",
      "  0.9547999005720996\n",
      "  0.9492497151099545\n",
      "  0.021826565768099753\n",
      "  0.0037403199750632556\n",
      "  \n",
      "  Singular values of target I_tilde (truncating to first 4 of 6) = \n",
      "  4.242640687119284\n",
      "  1.4142135623730956\n",
      "  1.4142135623730954\n",
      "  1.4142135623730951\n",
      "  2.5394445830714747e-16\n",
      "  1.118988490269554e-16\n",
      "  \n",
      "--- Iterative MLGST: Iter 1 of 6  128 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 170.639 (128 data params - 43 model params = expected mean of 85; p-value = 1.07327e-07)\n",
      "  Completed in 0.6s\n",
      "  2*Delta(log(L)) = 170.262\n",
      "  Iteration 1 took 0.6s\n",
      "  \n",
      "--- Iterative MLGST: Iter 2 of 6  204 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 434.966 (204 data params - 43 model params = expected mean of 161; p-value = 0)\n",
      "  Completed in 0.4s\n",
      "  2*Delta(log(L)) = 432.7\n",
      "  Iteration 2 took 0.5s\n",
      "  \n",
      "--- Iterative MLGST: Iter 3 of 6  486 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 1210.84 (486 data params - 43 model params = expected mean of 443; p-value = 0)\n",
      "  Completed in 0.8s\n",
      "  2*Delta(log(L)) = 1200.98\n",
      "  Iteration 3 took 1.0s\n",
      "  \n",
      "--- Iterative MLGST: Iter 4 of 6  898 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 1804.6 (898 data params - 43 model params = expected mean of 855; p-value = 0)\n",
      "  Completed in 1.2s\n",
      "  2*Delta(log(L)) = 1796.46\n",
      "  Iteration 4 took 1.5s\n",
      "  \n",
      "--- Iterative MLGST: Iter 5 of 6  1318 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 2315.63 (1318 data params - 43 model params = expected mean of 1275; p-value = 0)\n",
      "  Completed in 2.0s\n",
      "  2*Delta(log(L)) = 2309.07\n",
      "  Iteration 5 took 2.4s\n",
      "  \n",
      "--- Iterative MLGST: Iter 6 of 6  1738 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 2741.25 (1738 data params - 43 model params = expected mean of 1695; p-value = 0)\n",
      "  Completed in 2.6s\n",
      "  2*Delta(log(L)) = 2734.87\n",
      "  Iteration 6 took 3.3s\n",
      "  \n",
      "  Switching to ML objective (last iteration)\n",
      "  --- MLGST ---\n",
      "    Maximum log(L) = 1366.9 below upper bound of -2.90842e+07\n",
      "      2*Delta(log(L)) = 2733.79 (1738 data params - 43 model params = expected mean of 1695; p-value = 0)\n",
      "    Completed in 2.1s\n",
      "  2*Delta(log(L)) = 2733.79\n",
      "  Final MLGST took 2.1s\n",
      "  \n",
      "Iterative MLGST Total Time: 11.3s\n"
     ]
    }
   ],
   "source": [
    "#Create a target model which includes a duplicate Gi called Gi2\n",
    "mdl_targetB = target_model.copy()\n",
    "mdl_targetB['Gi2'] = target_model['Gi'].copy() # Gi2 should just be another Gi\n",
    "\n",
    "#Run GST with:\n",
    "# 1) replacement rules giving instructions how to process all of the operation sequences\n",
    "# 2) data set aliases which replace labels in the *processed* strings before querying the DataSet.\n",
    "rules = [ ((\"Gx\",\"Gi\") , (\"Gx\",\"Gi2\")) ] # a single replacement rule: GxGi => GxGi2 \n",
    "resultsB = pygsti.do_long_sequence_gst(ds, mdl_targetB, fiducials, fiducials,\n",
    "                                       germs, maxLengths, \n",
    "                                       advancedOptions={\"opLabelAliases\": {'Gi2': pygsti.obj.Circuit(['Gi'])},\n",
    "                                                        \"stringManipRules\": rules},\n",
    "                                       verbosity=2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This gives a better fit, but not as good as it should (given that we know the data was generated from exactly the model being used).  This is due to the (default) LGST seed being a bad starting point, which can happen, particularly when looking for context dependence.  (The LGST estimate - which you can print using `print(resultsB.estimates['default'].models['seed'])` - generates the *same* estimate for Gi and Gi2 which is roughly between the true values of Gi and Gi2, which can be a bad estimate for both gates.)  To instead use our own custom guess as the starting point, we do the following:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "--- Circuit Creation ---\n",
      "   1738 sequences created\n",
      "   Dataset has 1702 entries: 1738 utilized, 0 requested sequences were missing\n",
      "--- Iterative MLGST: Iter 1 of 6  128 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 170.639 (128 data params - 43 model params = expected mean of 85; p-value = 1.07324e-07)\n",
      "  Completed in 0.6s\n",
      "  2*Delta(log(L)) = 170.262\n",
      "  Iteration 1 took 0.6s\n",
      "  \n",
      "--- Iterative MLGST: Iter 2 of 6  204 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 434.966 (204 data params - 43 model params = expected mean of 161; p-value = 0)\n",
      "  Completed in 0.4s\n",
      "  2*Delta(log(L)) = 432.7\n",
      "  Iteration 2 took 0.4s\n",
      "  \n",
      "--- Iterative MLGST: Iter 3 of 6  486 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 1210.84 (486 data params - 43 model params = expected mean of 443; p-value = 0)\n",
      "  Completed in 0.8s\n",
      "  2*Delta(log(L)) = 1200.98\n",
      "  Iteration 3 took 1.0s\n",
      "  \n",
      "--- Iterative MLGST: Iter 4 of 6  898 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 1804.6 (898 data params - 43 model params = expected mean of 855; p-value = 0)\n",
      "  Completed in 1.2s\n",
      "  2*Delta(log(L)) = 1796.46\n",
      "  Iteration 4 took 1.5s\n",
      "  \n",
      "--- Iterative MLGST: Iter 5 of 6  1318 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 2315.63 (1318 data params - 43 model params = expected mean of 1275; p-value = 0)\n",
      "  Completed in 2.0s\n",
      "  2*Delta(log(L)) = 2309.07\n",
      "  Iteration 5 took 2.4s\n",
      "  \n",
      "--- Iterative MLGST: Iter 6 of 6  1738 operation sequences ---: \n",
      "  --- Minimum Chi^2 GST ---\n",
      "  Sum of Chi^2 = 2741.25 (1738 data params - 43 model params = expected mean of 1695; p-value = 0)\n",
      "  Completed in 2.6s\n",
      "  2*Delta(log(L)) = 2734.87\n",
      "  Iteration 6 took 3.3s\n",
      "  \n",
      "  Switching to ML objective (last iteration)\n",
      "  --- MLGST ---\n",
      "    Maximum log(L) = 1366.9 below upper bound of -2.90842e+07\n",
      "      2*Delta(log(L)) = 2733.79 (1738 data params - 43 model params = expected mean of 1695; p-value = 0)\n",
      "    Completed in 2.0s\n",
      "  2*Delta(log(L)) = 2733.79\n",
      "  Final MLGST took 2.0s\n",
      "  \n",
      "Iterative MLGST Total Time: 11.3s\n"
     ]
    }
   ],
   "source": [
    "#Create a guess, which we'll use instead of LGST - here we just\n",
    "# take a slightly depolarized target.\n",
    "mdl_start = mdl_targetB.depolarize(op_noise=0.01, spam_noise=0.01)\n",
    "\n",
    "#Run GST with the replacement rules as before.\n",
    "resultsC = pygsti.do_long_sequence_gst(ds, mdl_targetB, fiducials, fiducials,\n",
    "                                       germs, maxLengths, \n",
    "                                       advancedOptions={\"opLabelAliases\": {'Gi2': pygsti.obj.Circuit(['Gi'])},\n",
    "                                                        \"stringManipRules\": rules,\n",
    "                                                        \"starting point\": mdl_start},\n",
    "                                       verbosity=2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This results is a much better fit and estimate, as seen from the final `2*Delta(log(L))` number."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Diff between truth and standard GST:  0.0195517914063444\n",
      "Diff between truth and context-dep GST w/LGST starting pt:  0.005951917086817148\n",
      "Diff between truth and context-dep GST w/custom starting pt:  0.005951941892478445\n"
     ]
    }
   ],
   "source": [
    "gsA = pygsti.gaugeopt_to_target(results.estimates['default'].models['final iteration estimate'], mdl_datagen)\n",
    "gsB = pygsti.gaugeopt_to_target(resultsB.estimates['default'].models['final iteration estimate'], mdl_datagen)\n",
    "gsC = pygsti.gaugeopt_to_target(resultsC.estimates['default'].models['final iteration estimate'], mdl_datagen)\n",
    "gsA['Gi2'] = gsA['Gi'] #so gsA is comparable with mdl_datagen\n",
    "print(\"Diff between truth and standard GST: \", mdl_datagen.frobeniusdist(gsA))\n",
    "print(\"Diff between truth and context-dep GST w/LGST starting pt: \", mdl_datagen.frobeniusdist(gsB))\n",
    "print(\"Diff between truth and context-dep GST w/custom starting pt: \", mdl_datagen.frobeniusdist(gsC))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}