{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Parallel GST using MPI\n", "The purpose of this tutorial is to demonstrate how to compute GST estimates in parallel (using multiple CPUs or \"processors\"). The core PyGSTi computational routines are written to take advantage of multiple processors via the MPI communication framework, and so one must have a version of MPI and the `mpi4py` python package installed in order use run pyGSTi calculations in parallel. \n", "\n", "Since `mpi4py` doesn't play nicely with Jupyter notebooks, this tutorial is a bit more clunky than the others. In it, we will create a standalone Python script that imports `mpi4py` and execute it.\n", "\n", "We will use as an example the same \"standard\" single-qubit gate set of the first tutorial. We'll first create a dataset, and then a script to be run in parallel which loads the data. The creation of a simulated data is performed in the same way as the first tutorial. Since *random* numbers are generated and used as simulated counts within the call to `generate_fake_data`, it is important that this is *not* done in a parallel environment, or different CPUs may get different data sets. (This isn't an issue in the typical situation when the data is obtained experimentally.)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [], "source": [ "#Import pyGSTi and the \"stardard 1-qubit quantities for a gateset with X(pi/2), Y(pi/2), and idle gates\"\n", "import pygsti\n", "from pygsti.construction import std1Q_XYI\n", "\n", "#Create a data set\n", "gs_target = std1Q_XYI.gs_target\n", "fiducials = std1Q_XYI.fiducials\n", "germs = std1Q_XYI.germs\n", "maxLengths = [1,2,4,8,16,32]\n", "\n", "gs_datagen = gs_target.depolarize(gate_noise=0.1, spam_noise=0.001)\n", "listOfExperiments = pygsti.construction.make_lsgst_experiment_list(gs_target.gates.keys(), fiducials, fiducials, germs, maxLengths)\n", "ds = pygsti.construction.generate_fake_data(gs_datagen, listOfExperiments, nSamples=1000,\n", " sampleError=\"binomial\", seed=1234)\n", "pygsti.io.write_dataset(\"example_files/mpi_example_dataset.txt\", ds)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Next, we'll write a Python script that will load in the just-created `DataSet`, run GST on it, and write the output to a file. The only major difference between the contents of this script and previous examples is that the script imports `mpi4py` and passes a MPI comm object (`comm`) to the `do_long_sequence_gst` function. Since parallel computing is best used for computationaly intensive GST calculations, we also demonstrate how to set a per-processor memory limit to tell pyGSTi to partition its computations so as to not exceed this memory usage. Lastly, note the use of the `gaugeOptParams` argument of `do_long_sequence_gst`, which can be used to weight different gate set members differently during gauge optimization." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [], "source": [ "mpiScript = \"\"\"\n", "import time\n", "import pygsti\n", "from pygsti.construction import std1Q_XYI\n", "\n", "#get MPI comm\n", "from mpi4py import MPI\n", "comm = MPI.COMM_WORLD\n", "\n", "print(\"Rank %d started\" % comm.Get_rank())\n", "\n", "#define target gateset, fiducials, and germs as before\n", "gs_target = std1Q_XYI.gs_target\n", "fiducials = std1Q_XYI.fiducials\n", "germs = std1Q_XYI.germs\n", "maxLengths = [1,2,4,8,16,32]\n", "\n", "#tell gauge optimization to weight the gate matrix\n", "# elements 100x more heavily than the SPAM vector elements, and\n", "# to specifically weight the Gx gate twice as heavily as the other\n", "# gates.\n", "goParams = {'itemWeights':{'spam': 0.01, 'gates': 1.0, 'Gx': 2.0} }\n", "\n", "#Specify a per-core memory limit (useful for larger GST calculations)\n", "memLim = 2.1*(1024)**3 # 2.1 GB\n", "\n", "#Perform TP-constrained GST\n", "gs_target.set_all_parameterizations(\"TP\")\n", " \n", "#load the dataset\n", "ds = pygsti.io.load_dataset(\"example_files/mpi_example_dataset.txt\")\n", "\n", "start = time.time()\n", "results = pygsti.do_long_sequence_gst(ds, gs_target, fiducials, fiducials,\n", " germs, maxLengths,memLimit=memLim,\n", " gaugeOptParams=goParams, comm=comm,\n", " verbosity=2)\n", "end = time.time()\n", "print(\"Rank %d finished in %.1fs\" % (comm.Get_rank(), end-start))\n", "if comm.Get_rank() == 0:\n", " import pickle\n", " pickle.dump(results, open(\"example_files/mpi_example_results.pkl\",\"wb\"))\n", "\"\"\"\n", "with open(\"example_files/mpi_example_script.py\",\"w\") as f:\n", " f.write(mpiScript)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Next, we run the script with 3 processors using `mpiexec`. The `mpiexec` executable should have been installed with your MPI distribution -- if it doesn't exist, try replacing `mpiexec` with `mpirun`." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Rank 2 started\n", "Rank 0 started\n", "Rank 1 started\n", "--- Gate Sequence Creation ---\n", " 1702 sequences created\n", " Dataset has 1702 entries: 1702 utilized, 0 requested sequences were missing\n", "--- LGST ---\n", " Singular values of I_tilde (truncating to first 4 of 6) = \n", " 4.244089943192679\n", " 1.1594632778409213\n", " 0.965151667073796\n", " 0.9297628363691258\n", " 0.04925681134723816\n", " 0.02515065837213689\n", " \n", " Singular values of target I_tilde (truncating to first 4 of 6) = \n", " 4.242640687119285\n", " 1.4142135623730954\n", " 1.4142135623730947\n", " 1.4142135623730945\n", " 3.1723744950054595e-16\n", " 1.0852733691121267e-16\n", " \n", "--- Iterative MLGST: Iter 1 of 6 92 gate strings ---: \n", " --- Minimum Chi^2 GST ---\n", " Memory limit = 2.10GB\n", " Cur, Persist, Gather = 0.14, 0.00, 0.21 GB\n", " Finding num_nongauge_params is too expensive: using total params.\n", " Sum of Chi^2 = 95.7297 (91 data params - 43 model params = expected mean of 48; p-value = 5.13941e-05)\n", " Completed in 0.2s\n", " 2*Delta(log(L)) = 96.1163\n", " Iteration 1 took 0.2s\n", " \n", "--- Iterative MLGST: Iter 2 of 6 168 gate strings ---: \n", " --- Minimum Chi^2 GST ---\n", " Memory limit = 2.10GB\n", " Cur, Persist, Gather = 0.14, 0.00, 0.21 GB\n", " Finding num_nongauge_params is too expensive: using total params.\n", " Sum of Chi^2 = 162.192 (167 data params - 43 model params = expected mean of 124; p-value = 0.0120809)\n", " Completed in 0.4s\n", " 2*Delta(log(L)) = 162.56\n", " Iteration 2 took 0.4s\n", " \n", "--- Iterative MLGST: Iter 3 of 6 450 gate strings ---: \n", " --- Minimum Chi^2 GST ---\n", " Memory limit = 2.10GB\n", " Cur, Persist, Gather = 0.14, 0.00, 0.21 GB\n", " Finding num_nongauge_params is too expensive: using total params.\n", " Sum of Chi^2 = 484.676 (449 data params - 43 model params = expected mean of 406; p-value = 0.00435191)\n", " Completed in 0.7s\n", " 2*Delta(log(L)) = 485.572\n", " Iteration 3 took 0.8s\n", " \n", "--- Iterative MLGST: Iter 4 of 6 862 gate strings ---: \n", " --- Minimum Chi^2 GST ---\n", " Memory limit = 2.10GB\n", " Cur, Persist, Gather = 0.15, 0.00, 0.21 GB\n", " Finding num_nongauge_params is too expensive: using total params.\n", " Sum of Chi^2 = 895.303 (861 data params - 43 model params = expected mean of 818; p-value = 0.0307097)\n", " Completed in 1.3s\n", " 2*Delta(log(L)) = 896.23\n", " Iteration 4 took 1.4s\n", " \n", "--- Iterative MLGST: Iter 5 of 6 1282 gate strings ---: \n", " --- Minimum Chi^2 GST ---\n", " Memory limit = 2.10GB\n", " Cur, Persist, Gather = 0.15, 0.00, 0.21 GB\n", " Finding num_nongauge_params is too expensive: using total params.\n", " Sum of Chi^2 = 1350.86 (1281 data params - 43 model params = expected mean of 1238; p-value = 0.0133489)\n", " Completed in 2.0s\n", " 2*Delta(log(L)) = 1351.9\n", " Iteration 5 took 2.1s\n", " \n", "--- Iterative MLGST: Iter 6 of 6 1702 gate strings ---: \n", " --- Minimum Chi^2 GST ---\n", " Memory limit = 2.10GB\n", " Cur, Persist, Gather = 0.16, 0.00, 0.21 GB\n", " Finding num_nongauge_params is too expensive: using total params.\n", " Sum of Chi^2 = 1800.55 (1701 data params - 43 model params = expected mean of 1658; p-value = 0.00777432)\n", " Completed in 3.2s\n", " 2*Delta(log(L)) = 1801.81\n", " Iteration 6 took 3.3s\n", " \n", " Switching to ML objective (last iteration)\n", " --- MLGST ---\n", " Memory: limit = 2.10GB(cur, persist, gthr = 0.16, 0.00, 0.21 GB)\n", " Finding num_nongauge_params is too expensive: using total params.\n", " Maximum log(L) = 900.852 below upper bound of -2.84686e+06\n", " 2*Delta(log(L)) = 1801.7 (1701 data params - 43 model params = expected mean of 1658; p-value = 0.00737681)\n", " Completed in 1.2s\n", " 2*Delta(log(L)) = 1801.7\n", " Final MLGST took 1.2s\n", " \n", "Iterative MLGST Total Time: 9.4s\n", " -- Adding Gauge Optimized (go0) --\n", "--- Re-optimizing logl after robust data scaling ---\n", " --- MLGST ---\n", " Memory: limit = 2.10GB(cur, persist, gthr = 0.16, 0.00, 0.21 GB)\n", " Finding num_nongauge_params is too expensive: using total params.\n", " Maximum log(L) = 900.852 below upper bound of -2.84686e+06\n", " 2*Delta(log(L)) = 1801.7 (1701 data params - 43 model params = expected mean of 1658; p-value = 0.00737681)\n", " Completed in 0.9s\n", " -- Adding Gauge Optimized (go0) --\n", "Rank 2 finished in 14.5s\n", "Rank 0 finished in 14.1s\n", "Rank 1 finished in 14.4s\n", "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/plotly/tools.py:102: UserWarning:\n", "\n", "Looks like you don't have 'read-write' permission to your 'home' ('~') directory or to our '~/.plotly' directory. That means plotly's python api can't setup local configuration files. No problem though! You'll just have to sign-in using 'plotly.plotly.sign_in()'. For help with that: 'help(plotly.plotly.sign_in)'.\n", "Questions? Visit https://support.plot.ly\n", "\n" ] } ], "source": [ "! mpiexec -n 3 python3 \"example_files/mpi_example_script.py\"" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Notice in the above that output within `do_long_sequence_gst` is not duplicated (only the first processor outputs to stdout) so that the output looks identical to running on a single processor. Finally, we just need to read the pickled `Results` object from file and proceed with any post-processing analysis. In this case, we'll just create a report. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "*** Creating workspace ***\n", "*** Generating switchboard ***\n", "Found standard clifford compilation from std1Q_XYI\n", "*** Generating tables ***\n", " targetSpamBriefTable took 1.609668 seconds\n", " targetGatesBoxTable took 0.42235 seconds\n", " datasetOverviewTable took 0.310447 seconds\n", " bestGatesetSpamParametersTable took 0.001289 seconds\n", " bestGatesetSpamBriefTable took 0.626723 seconds\n", " bestGatesetSpamVsTargetTable took 0.484869 seconds\n", " bestGatesetGaugeOptParamsTable took 0.000917 seconds\n", " bestGatesetGatesBoxTable took 0.809621 seconds\n", " bestGatesetChoiEvalTable took 1.067601 seconds\n", " bestGatesetDecompTable took 0.480324 seconds\n", " bestGatesetEvalTable took 0.008096 seconds\n", " bestGermsEvalTable took 0.025843 seconds\n", " bestGatesetVsTargetTable took 0.458651 seconds\n", " bestGatesVsTargetTable_gv took 1.242838 seconds\n", " bestGatesVsTargetTable_gvgerms took 0.182804 seconds\n", " bestGatesVsTargetTable_gi took 0.016915 seconds\n", " bestGatesVsTargetTable_gigerms took 0.036738 seconds\n", " bestGatesVsTargetTable_sum took 1.212707 seconds\n", " bestGatesetErrGenBoxTable took 2.108415 seconds\n", " metadataTable took 0.264981 seconds\n", " stdoutBlock took 0.002612 seconds\n", " profilerTable took 0.001554 seconds\n", " softwareEnvTable took 0.099257 seconds\n", " exampleTable took 0.160756 seconds\n", " singleMetricTable_gv took 1.370196 seconds\n", " singleMetricTable_gi took 0.021888 seconds\n", " fiducialListTable took 0.001254 seconds\n", " prepStrListTable took 0.002327 seconds\n", " effectStrListTable took 0.000343 seconds\n", " colorBoxPlotKeyPlot took 0.168866 seconds\n", " germList2ColTable took 0.000524 seconds\n", " progressTable took 6.267582 seconds\n", "*** Generating plots ***\n", " gramBarPlot took 0.183764 seconds\n", " progressBarPlot took 5.912556 seconds\n", " progressBarPlot_sum took 0.000821 seconds\n", " finalFitComparePlot took 2.544998 seconds\n", " bestEstimateColorBoxPlot took 18.337868 seconds\n", " bestEstimateTVDColorBoxPlot took 15.348104 seconds\n", " bestEstimateColorScatterPlot took 21.429121 seconds\n", " bestEstimateColorHistogram took 13.44426 seconds\n", " progressTable_scl took 3.952974 seconds\n", " progressBarPlot_scl took 3.979353 seconds\n", " bestEstimateColorBoxPlot_scl took 12.154327 seconds\n", " bestEstimateColorScatterPlot_scl took 15.917986 seconds\n", " bestEstimateColorHistogram_scl took 12.474453 seconds\n", " dataScalingColorBoxPlot took 0.353511 seconds\n", "*** Merging into template file ***\n", " Rendering topSwitchboard took 0.00021 seconds\n", " Rendering maxLSwitchboard1 took 0.000168 seconds\n", " Rendering targetSpamBriefTable took 0.02041 seconds\n", " Rendering targetGatesBoxTable took 0.013005 seconds\n", " Rendering datasetOverviewTable took 0.001363 seconds\n", " Rendering bestGatesetSpamParametersTable took 0.003427 seconds\n", " Rendering bestGatesetSpamBriefTable took 0.034344 seconds\n", " Rendering bestGatesetSpamVsTargetTable took 0.0048 seconds\n", " Rendering bestGatesetGaugeOptParamsTable took 0.002043 seconds\n", " Rendering bestGatesetGatesBoxTable took 0.029497 seconds\n", " Rendering bestGatesetChoiEvalTable took 0.02538 seconds\n", " Rendering bestGatesetDecompTable took 0.02282 seconds\n", " Rendering bestGatesetEvalTable took 0.028038 seconds\n", " Rendering bestGermsEvalTable took 0.105653 seconds\n", " Rendering bestGatesetVsTargetTable took 0.002436 seconds\n", " Rendering bestGatesVsTargetTable_gv took 0.008389 seconds\n", " Rendering bestGatesVsTargetTable_gvgerms took 0.014594 seconds\n", " Rendering bestGatesVsTargetTable_gi took 0.005182 seconds\n", " Rendering bestGatesVsTargetTable_gigerms took 0.006462 seconds\n", " Rendering bestGatesVsTargetTable_sum took 0.007368 seconds\n", " Rendering bestGatesetErrGenBoxTable took 0.047871 seconds\n", " Rendering metadataTable took 0.007657 seconds\n", " Rendering stdoutBlock took 0.001859 seconds\n", " Rendering profilerTable took 0.003063 seconds\n", " Rendering softwareEnvTable took 0.005588 seconds\n", " Rendering exampleTable took 0.00715 seconds\n", " Rendering metricSwitchboard_gv took 7e-05 seconds\n", " Rendering metricSwitchboard_gi took 6.8e-05 seconds\n", " Rendering singleMetricTable_gv took 0.014464 seconds\n", " Rendering singleMetricTable_gi took 0.009797 seconds\n", " Rendering fiducialListTable took 0.004594 seconds\n", " Rendering prepStrListTable took 0.00378 seconds\n", " Rendering effectStrListTable took 0.003706 seconds\n", " Rendering colorBoxPlotKeyPlot took 0.007593 seconds\n", " Rendering germList2ColTable took 0.007473 seconds\n", " Rendering progressTable took 0.010774 seconds\n", " Rendering gramBarPlot took 0.005781 seconds\n", " Rendering progressBarPlot took 0.004351 seconds\n", " Rendering progressBarPlot_sum took 0.004597 seconds\n", " Rendering finalFitComparePlot took 0.0039 seconds\n", " Rendering bestEstimateColorBoxPlot took 0.131067 seconds\n", " Rendering bestEstimateTVDColorBoxPlot took 0.101251 seconds\n", " Rendering bestEstimateColorScatterPlot took 0.121297 seconds\n", " Rendering bestEstimateColorHistogram took 0.070754 seconds\n", " Rendering progressTable_scl took 0.009301 seconds\n", " Rendering progressBarPlot_scl took 0.00391 seconds\n", " Rendering bestEstimateColorBoxPlot_scl took 0.12323 seconds\n", " Rendering bestEstimateColorScatterPlot_scl took 0.147049 seconds\n", " Rendering bestEstimateColorHistogram_scl took 0.092682 seconds\n", " Rendering dataScalingColorBoxPlot took 0.033547 seconds\n", "Output written to example_files/mpi_example_brief directory\n", "Opening example_files/mpi_example_brief/main.html...\n", "*** Report Generation Complete! Total time 147.852s ***\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pickle\n", "results = pickle.load(open(\"example_files/mpi_example_results.pkl\",\"rb\"))\n", "pygsti.report.create_standard_report(results, \"example_files/mpi_example_brief\",\n", " title=\"MPI Example Report\", verbosity=2, auto_open=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 1 }