{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The `MultiDataSet` object: a dictionary of `DataSet`s\n", "\n", "Sometimes it is useful to work with several sets of data, all of which hold counts for the *same* set of operation sequences. For example, collecting data to perform GST on Monday and then again on Tuesday, or making an adjustment to an experimental system and re-taking data, could create two separate data sets with the same sequences. PyGSTi has a separate data type, `pygsti.objects.MultiDataSet`, for this purpose. A `MultiDataSet` looks and acts like a simple dictionary of `DataSet` objects, but underneath it implements certain optimizations that reduce the amount of space and memory required to store the data. Primarily, it holds just a *single* list of the circuits, as opposed to an actual dictionary of `DataSet`s in which each `DataSet` contains its own copy of the circuits. In addition to being more space efficient, a `MultiDataSet` is able to aggregate all of its data into a single \"summed\" `DataSet` via `get_datasets_aggregate(...)`, which can be useful for combining several \"passes\" of experimental data.\n", "\n", "Several remarks regarding a `MultiDataSet` are worth mentioning:\n", "- you add `DataSet`s to a `MultiDataSet` using the `add_dataset` method. However, only *static* `DataSet` objects can be added. This is because the `MultiDataSet` must keep all of its `DataSet`s locked to the same set of sequences, and a non-static `DataSet` allows the addition or removal of only *its* sequences. 
(If the `DataSet` you want to add isn't in static mode, call its `done_adding_data` method.)\n", "- square-bracket indexing accesses the `MultiDataSet` as if it were a dictionary of `DataSet`s.\n", "- `MultiDataSet`s can be loaded from and saved to a single text-format file with columns for each contained `DataSet`; see `pygsti.io.load_multidataset`.\n", "\n", "Here's a brief example of using a `MultiDataSet`:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MultiDataSet has 2 DataSets with labels ['myDS', 'myDS2']\n", "Empty string data for myDS = {('0',): 10.0, ('1',): 90.0}\n", "Empty string data for myDS2 = {('0',): 15.0, ('1',): 85.0}\n", "Gx string data (no label) = {('0',): 10.0, ('1',): 90.0}\n", "Gx string data (no label) = {('0',): 5.0, ('1',): 95.0}\n", "GxGy string data for myDS = {('0',): 20.0, ('1',): 80.0}\n", "GxGy string data for myDS2 = {('0',): 30.0, ('1',): 70.0}\n", "\n", "Summed data:\n", "{} : {('0',): 25.0, ('1',): 175.0}\n", "Gx : {('0',): 15.0, ('1',): 185.0}\n", "GxGy : {('0',): 50.0, ('1',): 150.0}\n", "GxGxGxGx : {('0',): 60.0, ('1',): 140.0}\n", "\n", "\n" ] } ], "source": [ "from __future__ import print_function\n", "import pygsti\n", "\n", "multiDS = pygsti.objects.MultiDataSet()\n", "\n", "# Create some datasets\n", "ds = pygsti.objects.DataSet(outcomeLabels=['0','1'])\n", "ds.add_count_dict( (), {'0': 10, '1': 90} )\n", "ds.add_count_dict( ('Gx',), {'0': 10, '1': 90} )\n", "ds.add_count_dict( ('Gx','Gy'), {'0': 20, '1': 80} )\n", "ds.add_count_dict( ('Gx','Gx','Gx','Gx'), {'0': 20, '1': 80} )\n", "ds.done_adding_data()\n", "\n", "ds2 = pygsti.objects.DataSet(outcomeLabels=['0','1'])\n", "ds2.add_count_dict( (), {'0': 15, '1': 85} )\n", "ds2.add_count_dict( ('Gx',), {'0': 5, '1': 95} )\n", "ds2.add_count_dict( ('Gx','Gy'), {'0': 30, '1': 70} )\n", "ds2.add_count_dict( ('Gx','Gx','Gx','Gx'), {'0': 40, '1': 60} )\n", 
"ds2.done_adding_data()\n", "\n", "multiDS['myDS'] = ds\n", "multiDS['myDS2'] = ds2\n", "\n", "# len() of a MultiDataSet counts its DataSets, not its operation sequences\n", "nDatasets = len(multiDS)\n", "dslabels = list(multiDS.keys())\n", "print(\"MultiDataSet has %d DataSets with labels %s\" % (nDatasets, dslabels))\n", "\n", "for dslabel in multiDS:\n", "    ds = multiDS[dslabel]\n", "    print(\"Empty string data for %s =\" % dslabel, ds[()])\n", "\n", "for ds in multiDS.values():\n", "    print(\"Gx string data (no label) =\", ds[('Gx',)])\n", "\n", "for dslabel,ds in multiDS.items():\n", "    print(\"GxGy string data for %s =\" % dslabel, ds[('Gx','Gy')])\n", "\n", "dsSum = multiDS.get_datasets_aggregate('myDS','myDS2')\n", "print(\"\\nSummed data:\")\n", "print(dsSum)\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loading ../../tutorial_files/TinyMultiDataset.txt: 100%\n", "\n", "Loaded from file:\n", "\n", "MultiDataSet containing: 2 datasets, each with 4 strings\n", " Dataset names = DS0, DS1\n", " Outcome labels = ('0',), ('1',)\n", "Gate strings: \n", "\n", "Qubit * ---|Gx|---\n", "\n", "Qubit * ---|Gx|-|Gy|---\n", "\n", "Qubit * ---|Gx|-|Gx|-|Gx|-|Gx|---\n", "\n", "\n" ] } ], "source": [ "multi_dataset_txt = \\\n", "\"\"\"## Columns = DS0 0 count, DS0 1 count, DS1 0 frequency, DS1 count total \n", "{} 0 100 0 100 \n", "Gx 10 90 0.1 100 \n", "GxGy 40 60 0.4 100 \n", "Gx^4 20 80 0.2 100 \n", "\"\"\"\n", "\n", "with open(\"../../tutorial_files/TinyMultiDataset.txt\",\"w\") as output:\n", "    output.write(multi_dataset_txt)\n", "multiDS_fromFile = pygsti.io.load_multidataset(\"../../tutorial_files/TinyMultiDataset.txt\", cache=False)\n", "\n", "print(\"\\nLoaded from file:\\n\")\n", "print(multiDS_fromFile)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Those are the basics of using `MultiDataSet`. More information is available in the docstrings for the various `MultiDataSet` methods." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.0" } }, "nbformat": 4, "nbformat_minor": 1 }