{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# The `MultiDataSet` object: a dictionary of `DataSet`s\n",
    "\n",
    "Sometimes it is useful to deal with several sets of data all of which hold counts for the *same* set of operation sequences.  For example, colleting data to perform GST on Monday and then again on Tuesday, or making an adjustment to an experimental system and re-taking data, could create two separate data sets with the same sequences.  PyGSTi has a separate data type, `pygsti.objects.MultiDataSet`, for this purpose.  A `MultiDataSet` looks and acts like a simple dictionary of `DataSet` objects, but underneath implements some certain optimizations that reduce the amount of space and memory required to store the data.  Primarily, it holds just a *single* list of the circuits - as opposed to an actual dictionary of `DataSet`s in which each `DataSet` contains it's own copy of the circuits.  In addition to being more space efficient, a `MultiDataSet` is able to aggregate all of its data into a single \"summed\" `DataSet` via `get_datasets_aggregate(...)`, which can be useful for combining several \"passes\" of experimental data.  \n",
    "\n",
    "Several remarks regarding a `MultiDataSet` are worth mentioning:\n",
    "- you add `DataSets` to a `MultiDataSet` using the `add_dataset` method.  However only *static* `DataSet` objects can be added.  This is because the MultiDataSet must keep all of its `DataSet`s locked to the same set of sequences, and a non-static `DataSet` allows the addition or removal of only *its* sequences.  (If the `DataSet` you want to add isn't in static-mode, call its `done_adding_data` method.)\n",
    "- square-bracket indexing accesses the `MultiDataSet` as if it were a dictionary of `DataSets`.\n",
    "- `MultiDataSets` can be loaded and saved from a single text-format file with columns for each contained `DataSet` - see `pygsti.io.load_multidataset`.\n",
    "\n",
    "Here's a brief example of using a `MultiDataSet`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "MultiDataSet has 2 operation sequences and DataSet labels ['myDS', 'myDS2']\n",
      "Empty string data for myDS =  {('0',): 10.0, ('1',): 90.0}\n",
      "Empty string data for myDS2 =  {('0',): 15.0, ('1',): 85.0}\n",
      "Gx string data (no label) = {('0',): 10.0, ('1',): 90.0}\n",
      "Gx string data (no label) = {('0',): 5.0, ('1',): 95.0}\n",
      "GxGy string data for myDS = {('0',): 20.0, ('1',): 80.0}\n",
      "GxGy string data for myDS2 = {('0',): 30.0, ('1',): 70.0}\n",
      "\n",
      "Summed data:\n",
      "{}  :  {('0',): 25.0, ('1',): 175.0}\n",
      "Gx  :  {('0',): 15.0, ('1',): 185.0}\n",
      "GxGy  :  {('0',): 50.0, ('1',): 150.0}\n",
      "GxGxGxGx  :  {('0',): 60.0, ('1',): 140.0}\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from __future__ import print_function\n",
    "import pygsti\n",
    "\n",
    "multiDS = pygsti.objects.MultiDataSet()\n",
    "\n",
    "#Create some datasets                                           \n",
    "ds = pygsti.objects.DataSet(outcomeLabels=['0','1'])\n",
    "ds.add_count_dict( (), {'0': 10, '1': 90} )\n",
    "ds.add_count_dict( ('Gx',), {'0': 10, '1': 90} )\n",
    "ds.add_count_dict( ('Gx','Gy'), {'0': 20, '1': 80} )\n",
    "ds.add_count_dict( ('Gx','Gx','Gx','Gx'), {'0': 20, '1': 80} )\n",
    "ds.done_adding_data()\n",
    "\n",
    "ds2 = pygsti.objects.DataSet(outcomeLabels=['0','1'])            \n",
    "ds2.add_count_dict( (), {'0': 15, '1': 85} )\n",
    "ds2.add_count_dict( ('Gx',), {'0': 5, '1': 95} )\n",
    "ds2.add_count_dict( ('Gx','Gy'), {'0': 30, '1': 70} )\n",
    "ds2.add_count_dict( ('Gx','Gx','Gx','Gx'), {'0': 40, '1': 60} )\n",
    "ds2.done_adding_data()\n",
    "\n",
    "multiDS['myDS'] = ds\n",
    "multiDS['myDS2'] = ds2\n",
    "\n",
    "nStrs = len(multiDS)\n",
    "dslabels = list(multiDS.keys())\n",
    "print(\"MultiDataSet has %d operation sequences and DataSet labels %s\" % (nStrs, dslabels))\n",
    "    \n",
    "for dslabel in multiDS:\n",
    "    ds = multiDS[dslabel]\n",
    "    print(\"Empty string data for %s = \" % dslabel, ds[()])       \n",
    "\n",
    "for ds in multiDS.values():\n",
    "    print(\"Gx string data (no label) =\", ds[('Gx',)])     \n",
    "\n",
    "for dslabel,ds in multiDS.items():\n",
    "    print(\"GxGy string data for %s =\" % dslabel, ds[('Gx','Gy')])  \n",
    "\n",
    "dsSum = multiDS.get_datasets_aggregate('myDS','myDS2')\n",
    "print(\"\\nSummed data:\")\n",
    "print(dsSum)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Loading ../../tutorial_files/TinyMultiDataset.txt: 100%\n",
      "\n",
      "Loaded from file:\n",
      "\n",
      "MultiDataSet containing: 2 datasets, each with 4 strings\n",
      " Dataset names = DS0, DS1\n",
      " Outcome labels = ('0',), ('1',)\n",
      "Gate strings: \n",
      "\n",
      "Qubit * ---|Gx|---\n",
      "\n",
      "Qubit * ---|Gx|-|Gy|---\n",
      "\n",
      "Qubit * ---|Gx|-|Gx|-|Gx|-|Gx|---\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "multi_dataset_txt = \\\n",
    "\"\"\"## Columns = DS0 0 count, DS0 1 count, DS1 0 frequency, DS1 count total                                \n",
    "{} 0 100 0 100                                                                                                      \n",
    "Gx 10 90 0.1 100                                                                                                    \n",
    "GxGy 40 60 0.4 100                                                                                                  \n",
    "Gx^4 20 80 0.2 100                                                                                                  \n",
    "\"\"\"\n",
    "\n",
    "with open(\"../../tutorial_files/TinyMultiDataset.txt\",\"w\") as output:\n",
    "    output.write(multi_dataset_txt)\n",
    "multiDS_fromFile = pygsti.io.load_multidataset(\"../../tutorial_files/TinyMultiDataset.txt\", cache=False)\n",
    "\n",
    "print(\"\\nLoaded from file:\\n\")\n",
    "print(multiDS_fromFile)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Those are the basics of using `MultiDataSet`.  More information is available in the docstrings for the various `MultiDataSet` methods."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}