{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": true, "deletable": true, "editable": true }, "source": [ "# Pure Data Analysis\n", "\n", "This tutorial covers methods of analysing data *without* running GST. So far there is only one such method, \"Data Set Comparison\", which checks whether two (or more) datasets are consistent with each other.\n", "\n", "## Data Set Comparison\n", "This method declares that two or more `DataSet`s are \"consistent\" if the observed counts for the same gate strings across the data sets are all consistent with being generated by the same underlying gateset. This protocol can be used to test for, among other things, drift and crosstalk. It can also be used\n", "to compare an experimental dataset to an \"ideal\" dataset." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/enielse/research/pyGSTi/packages/pygsti/tools/matrixtools.py:23: UserWarning: Could not import Cython extension - falling back to slower pure-python routines\n", " _warnings.warn(\"Could not import Cython extension - falling back to slower pure-python routines\")\n" ] } ], "source": [ "from __future__ import division, print_function\n", "\n", "import pygsti\n", "import numpy as np\n", "import scipy\n", "from scipy import stats\n", "from pygsti.construction import std1Q_XYI" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "deletable": true, "editable": true }, "source": [ "Let's first compare two `DataSet` objects where the underlying gate sets are the same. The data sets we'll use are GST datasets (which allows us to do some nice visualization), but arbitrary datasets will work in general, provided that the gate sequences across the datasets are the same." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "#Let's make our underlying gate set have a little bit of random unitary noise.\n", "gs_exp_0 = std1Q_XYI.gs_target.copy()\n", "gs_exp_0 = gs_exp_0.randomize_with_unitary(.01,seed=0)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [], "source": [ "germs = std1Q_XYI.germs\n", "fiducials = std1Q_XYI.fiducials\n", "max_lengths = [1,2,4,8,16,32,64,128,256]\n", "gate_sequences = pygsti.construction.make_lsgst_experiment_list(std1Q_XYI.gates,fiducials,fiducials,germs,max_lengths)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "#Generate the data for the two datasets, using the same gate set, with 100 repetitions of each sequence.\n", "N=100\n", "DS_0 = pygsti.construction.generate_fake_data(gs_exp_0,gate_sequences,N,'binomial',seed=10)\n", "DS_1 = pygsti.construction.generate_fake_data(gs_exp_0,gate_sequences,N,'binomial',seed=20)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [], "source": [ "#Let's compare the two datasets.\n", "comparator_0_1 = pygsti.objects.DataComparator([DS_0,DS_1])" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Consistency report- datasets are inconsistent at given confidence level if EITHER of the following scores report inconsistency.\n", "\n", "Threshold for individual gatestring scores is 19.8354568601\n", "As measured by worst-performing gate strings, data sets are CONSISTENT at the 95.0% confidence level.\n", "0 gate string(s) have loglikelihood scores greater than the 
threshold.\n", "\n", "Threshold for sum of gatestring scores is 3114.73885644.\n", "As measured by sum of gatestring scores, data sets are CONSISTENT at the 95.0% confidence level.\n", "Total loglikelihood is 2720.86373067\n", "Total number of standard deviations (N_sigma) of model violation is -3.13296117274.\n" ] } ], "source": [ "#Let's get the report from the comparator.\n", "comparator_0_1.report(confidence_level=0.95)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "#Create a workspace to show plots\n", "w = pygsti.report.Workspace()\n", "w.init_notebook_mode(connected=False, autodisplay=True) " ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "
\n", "
\n", "\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#As we expect, the datasets are consistent!\n", "#We can also visualize this in a few ways:\n", "\n", "#This will show a histogram of the p-values associated with the different gate strings.\n", "#If the null hypothesis (that the underlying gate sets are the same) is true,\n", "#then we expect the distribution to roughly follow the dotted green line.\n", "w.DatasetComparisonHistogramPlot(comparator_0_1, log=True, display='pvalue')" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "
\n", "
\n", "\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Color box plot comparing two datasets generated from the same gate set\n", "gssList = pygsti.construction.make_lsgst_structs(std1Q_XYI.gates, fiducials, fiducials, germs, max_lengths)\n", "w.ColorBoxPlot('dscmp', gssList[-1], None, None, dscomparator=comparator_0_1)\n", "#A lack of green boxes indicates consistency between datasets!" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "#Now let's generate data from two similar but not identical gate sets and see if our test can detect the difference." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "#A second gate set: the same target, but with a different random unitary perturbation (seed=1).\n", "gs_exp_1 = std1Q_XYI.gs_target.copy()\n", "gs_exp_1 = gs_exp_1.randomize_with_unitary(.01,seed=1)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "DS_2 = pygsti.construction.generate_fake_data(gs_exp_1,gate_sequences,N,'binomial',seed=30)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Consistency report- datasets are inconsistent at given confidence level if EITHER of the following scores report inconsistency.\n", "\n", "Threshold for individual gatestring scores is 19.8354568601\n", "As measured by worst-performing gate strings, data sets are INCONSISTENT at the 95.0% confidence level.\n", "690 gate string(s) have loglikelihood scores greater than the threshold.\n", "\n", "Threshold for sum of gatestring scores is 3114.73885644.\n", "As measured by sum of gatestring scores, data sets are INCONSISTENT 
at the 95.0% confidence level.\n", "Total loglikelihood is 51382.5156194\n", "Total number of standard deviations (N_sigma) of model violation is 629.103186436.\n" ] } ], "source": [ "#Let's make the comparator and get the report.\n", "comparator_1_2 = pygsti.objects.DataComparator([DS_1,DS_2])\n", "comparator_1_2.report(confidence_level=0.95)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "
\n", "
\n", "\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#The datasets are significantly inconsistent! Let's see what the distribution of p-values looks like now:\n", "w.DatasetComparisonHistogramPlot(comparator_1_2)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false, "deletable": true, "editable": true, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "
\n", "
\n", "\n", "
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Color box plot comparing two datasets generated from different gate sets\n", "w.ColorBoxPlot('dscmp', gssList[-1], None, None, dscomparator=comparator_1_2)\n", "#The red boxes indicate inconsistency between datasets!" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [ "#While we only look at gate sets with Markovian, unitary errors here, this protocol can also be used when the\n", "#error is neither unitary nor Markovian." ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.10" } }, "nbformat": 4, "nbformat_minor": 0 }