{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Welcome to the Lab!\n", "\n", "In your lab, you have a collection $\\{ S\\}$ of *sources* and a collection $\\{B\\} $ of *boxes*. Each box has an input port, an output port, and some number $n>1$ of lights on it. \n", "\n", "A source spits out *stuff*, which you can direct into the input port of a box, and stuff will come out the output port. When you send stuff into a box, sometimes one of the $n$ lights will blink. You can arrange the sources and the boxes in whatever pattern you like, sending stuff into boxes, and thence into other boxes. The question is: Can you predict the pattern of lights?\n", "\n", "## The Tools\n", "\n", "First, for each box, figure out which sources always spit out stuff that makes a light blink: we'll say then that a given source $S$ is *compatible* with box $B$. \n", "\n", "### Probability Vectors\n", "\n", "We can characterize the relationship between a compatible source $S$ and a box $B$ with a *probability vector*: $p(B_{i})$. Label the lights on box $B_{i}=1 \\dots n$, and send stuff from the same source $S$ into the same box $B$ many times, while keeping a tally of how many times each light blinks. Dividing each tally by the total amount of stuff you sent in gives you a probability vector such that $\\sum_{i} p(B_{i}) = 1$. \n", "\n", "### Conditional Probability Matrices\n", "\n", "Similarly, we can characterize the relationship between two boxes $A$ and $B$ with a *conditional probability matrix*: $p(B_{i}|A_{j})$. Make a table whose rows are labeled by lights of $B$ and whose columns are labeled by lights of $A$. Send stuff at random into box $A$, and then direct what comes out into box $B$, and record which lights $B_{i}, A_{j}$ blinked: put a tally mark at the $i^{th}$ row and $j^{th}$ column of your table. Finally, divide each column by the total of that column to get the conditional probability matrix. \n", "\n", "### The Law of Total Probability\n", "\n", "Suppose instead we'd sent stuff from a *single source* $S$ into $A$ and then into $B$, kept tallies of the lights, and then divided the whole matrix by the total amount of stuff we'd sent in. E.g.:\n", "\n", "$\\begin{pmatrix}p(B_{0}A_{0}) &p(B_{0}A_{1}) &p(B_{0}A_{2}) \\\\ p(B_{1}A_{0}) &p(B_{1}A_{1}) &p(B_{1}A_{2}) \\\\p(B_{2}A_{0}) &p(B_{2}A_{1}) &p(B_{2}A_{2}) \\\\\\end{pmatrix}$\n", "\n", "Then the sum of each row $i$ of this matrix would give us probabilities $p(B_{i})$, and the sum of each column $j$ would give us the probabilities $p(A_{j})$, for this situation. If we divide each column by its column sum, which is $p(A_{j})$, we'd obtain the conditional probability matrix. Thus, we can interpret our present matrix's entries as: $p(B_{i}|A_{j})p(A_{j})$. In other words, $p(B_{i}A_{j}) = p(B_{i}|A_{j})p(A_{j})$.\n", "\n", "If we sum each row, we obtain: $p(B_{i}) = \\sum_{j} p(B_{i}|A_{j})p(A_{j})$. This is known as the \"Law of Total Probability.\" In matrix notation: $\\vec{b} = [B|A]\\vec{a}$, where $\\vec{a}$ is the probability vector for $A$'s lights given the source; $[B|A]$ is the conditional probability matrix for $B$'s lights given $A$'s lights; and $\\vec{b}$ is the probability vector for $B$'s lights to blink after the stuff from $S$ has gone into $A$ and then into $B$. Naturally, if we know the conditional probability matrices for each pair in a chain of boxes, we can calculate probabilities for the lights of the final box, using the probabilities for the lights of the initial box. 
\n", "\n", "### Self Conditional Probability Matrices\n", "\n", "Next, we can characterize a box $B$ in its own terms by looking at its *self conditional probability matrix*, in other words, the conditional probability matrix corresponding to sending stuff into the same box twice: $p(B_{i}|B_{j})$, or $[B|B]$. We'll call the columns of this matrix the *output vectors* of $B$. \n", "\n", "#### Completely Reliable Boxes\n", "\n", "Certain boxes have a special property, namely, that $[B|B] = I$, the identity matrix. In other words, if you send stuff into such a box $B$ and the $i^{th}$ light blinks, and then you direct the stuff into another such box $B$, then the $i^{th}$ light will always blink again. To wit, if you repeat the box, you'll always get the same result. We'll call these *reliable boxes*. In general, it might be that boxes are *somewhat reliable*, in that some outcomes are reliable, but not others.\n", "\n", "#### The Dimension\n", "\n", "To each kind of stuff we can associate a number $d$, its dimension. In general, stuff can light up a box with any number of lights: but each box is only compatible with stuff if the stuff has the right dimension. To determine the dimension, sift through all the boxes compatible with a given kind of stuff: you'll find there will be a maximum number of lights $d$ such that if a compatible box has more than $d$ lights, *it can't be reliable*. In other words, the dimension $d$ (of some stuff) is the maximum number of lights possible on a reliable box compatible with that stuff.\n", "\n", "#### Informational Completeness\n", "\n", "In general, if we repeatedly send stuff from two different sources into the same box, and keep track of their two probability vectors, we might get the same probabilities for each. In that case, the box can't tell the two sources of stuff apart. We'll call a box *informationally complete*, however, if each kind of stuff gives you a signature vector of probabilities, so that, having estimated the probabilities, you'll never confuse one source for another. For a box to be informationally complete, it must have at least $d^2$ lights, where $d$ is the dimension of the stuff the box is compatible with. Note therefore that a reliable box can't be informationally complete, and an informationally complete box can't be reliable. Finally, the more outcomes a box has, in principle, the less disturbing it may be to the stuff that goes through it. \n", "\n", "#### Bias\n", "\n", "A box may be biased or unbiased: in other words, some of the box's lights may be more intrinsically likely or unlikely to blink independently of the stuff you throw into it. To determine the bias, throw stuff from many different sources, completely at random, into the box, and collect the probability vector for the lights. We'll call it $\\vec{c}$, the bias vector. If this vector is the completely even probability vector, e.g. $\\begin{pmatrix} \\frac{1}{n} \\\\ \\frac{1}{n} \\\\ \\vdots \\end{pmatrix}$ for $n$ lights, then the box is *unbiased*. This means that if we're completely uncertain about what went in, then we're completely uncertain about what light will blink. If we get some other probability vector, however, then $\\vec{c}$ tells us how the box is *biased* toward different outcomes. Either way, it'll be the case that $[B|B]\\vec{c}_{B} = \\vec{c}_{B}$.\n", "\n", "### The Metric\n", "\n", "Finally, for each box, we can construct a *metric*. 
\n", "\n", "### The Metric\n", "\n", "Finally, for each box, we can construct a *metric*. Take the self conditional probability matrix $[B|B]$, and the bias vector $\vec{c}_{B}$, and multiply each row of the matrix entry-wise by $d\vec{c}_{B}$, where $d$ is the dimension. The resulting matrix $G_{B}$ will be a symmetric matrix, as will its inverse $G_{B}^{-1}$. We'll call $G_{B}^{-1}$ the metric.\n", "\n", "Note that if a box is informationally *overcomplete* (being informationally complete, but having more than $d^2$ outcomes), then $G$ won't have a proper inverse. Instead take the pseudo-inverse: using the singular value decomposition, we can write $G = UDV$, where $U$ and $V$ are orthogonal matrices (whose inverse is their transpose), and $D$ is a diagonal matrix with the \"singular values\" along the diagonal. $D$ will have $d^2$ non-zero entries: take their reciprocals, leaving the zeros as they are, to form $D^{-1}$. Then: $G^{-1} = V^{T}D^{-1}U^{T}$. \n", "\n", "If you have an informationally complete box, each kind of compatible stuff can be characterized by a unique set of probabilities; but if you have an informationally overcomplete box, there will in general be multiple probability vectors which characterize the same stuff, all of which, however, lead to equivalent predictions.\n", "\n", "### The Transition Probability\n", "\n", "Suppose we have two sources $S_{a}$ and $S_{b}$, each producing stuff of dimension $d$, and an informationally complete box $R$ with $d^2$ lights. We collect the probabilities $\vec{a}$ and $\vec{b}$ for these two sources with respect to the box $R$. We could ask the question: Suppose I send stuff $a$ into *any* box: what's the probability that stuff $b$ will come out?\n", "\n", "The metric provides us with the answer: $p_{a \rightarrow b} = p_{b \rightarrow a} = \vec{b}G_{R}^{-1}\vec{a}$.\n", "\n", "In general, the actual probability will depend on the number of lights on the latter box and on its bias, but it will be proportional to the number above. And if $\vec{b}G_{R}^{-1}\vec{a} = 0$, then we can say that if $a$-stuff goes in, we should predict that $b$-stuff will *never* come out, and vice versa, for *any* box. If this is the case, we'll say $a$ and $b$ are *mutually exclusive*. If a box is completely reliable, then all its outcome vectors are mutually exclusive.\n", "\n", "Any box will determine a metric in this way. If the box is informationally complete, then the transition probability determined in terms of its metric will always agree with the transition probability determined using any other informationally complete box's metric. E.g., $\vec{b}_{B}G_{B}^{-1}\vec{a}_{B} = \vec{b}_{C}G_{C}^{-1}\vec{a}_{C}$ for all informationally complete boxes $B$ and $C$.\n", "\n", "#### Purity\n", "\n", "We can consider $p_{a \rightarrow a} = \vec{a}G_{B}^{-1}\vec{a}$. Since this is itself a probability, $0 \leq \vec{a}G_{B}^{-1}\vec{a} \leq 1$. \n", "\n", "If $p_{a \rightarrow a} = 1$, then we'll call $\vec{a}$ a *pure* probability vector. If $p_{a \rightarrow a} < 1$, we'll call it a *mixed* probability vector. Mixed probability vectors can always be written as convex combinations (weighted averages) of pure vectors. \n", "\n", "The idea is that a pure vector characterizes one kind of stuff, but a mixed vector characterizes uncertainty about which kind of stuff you're dealing with. For example, if you, behind my back, fed stuff into $B$ from source $S_{a}$ $\frac{1}{3}$ of the time and stuff from source $S_{b}$ $\frac{2}{3}$ of the time, and I recorded the probabilities, I'd get a mixed probability vector: $\vec{s} = \frac{1}{3}\vec{a} + \frac{2}{3} \vec{b}$. 
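\n", "\n", "For a concrete check (assuming NumPy), we can borrow the metric of the $d=2$ reference box worked out later in this notebook, whose $G_{R}^{-1}$ has $5$ on the diagonal and $-1$ everywhere else:\n", "\n", "```python\n", "import numpy as np\n", "\n", "# G^{-1} for the d=2 reference box described below: 5 on the diagonal, -1 off it.\n", "G_inv = 6*np.eye(4) - np.ones((4, 4))\n", "\n", "a = np.array([1/2, 1/6, 1/6, 1/6])  # one of that box's own output vectors\n", "m = np.ones(4)/4                    # the completely even vector\n", "\n", "print(a @ G_inv @ a)  # 1.0 -> pure\n", "print(m @ G_inv @ m)  # 0.5 -> mixed\n", "```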
\n", "\n", "Similarly, we have, on the one hand, *pure boxes* and, on the other hand, *mixed boxes*. The output vectors of pure boxes (the columns of $[A|A]$) are all pure. Another way to check if a box is pure is to see whether the diagonal of $[A|A] = d\\vec{c}_{A}$, where $\\vec{c}_{A}$ is the bias vector. If so, the box is pure.\n", "\n", "In the case of a pure box, if you see the $i^{th}$ light blink, you should wager that the stuff afterwards will be characterized by the $i^{th}$ output vector of $A$. We shall come to the case of mixed boxes later.\n", "\n", "### Passive Reference Frame Switches\n", "\n", "We've seen how by means of the metric provided by an informationally complete box (which we'll call a reference box), we can can calculate the transition probability between two kinds of stuff which is applicable *for all boxes*. \n", "\n", "Similarly, we can calculate the probabilities for the outcomes of *any box* in terms of a) the probabilities on the reference box, and b) the conditional probabilities between the reference box and the box in question.\n", "\n", "Suppose we have a reference box $R$ and another box $E$, and we have the probabilities for some stuff coming out of a source $S$ on the reference box: $\\vec{r}$. We also have the conditional probabilities for outcomes of $E$ given outcomes of $R$: $[E|R]$. Finally, we have the self conditional probability matrix for the reference box with itself: $[R|R]$. We invert this matrix: $[R|R]^{-1}$.\n", "\n", "By the law of total probability, we can calculate the probability for one of $E$'s lights to blink if stuff characterized by $\\vec{r}$ undergoes an $R$ box, and then an $E$ box. It's: $[E|R]\\vec{r}$. \n", "\n", "On the other hand, using $[R|R]^{-1}$, can calculate the probability for one of $E$'s lights to blink supposing that stuff characterized by $\\vec{r}$ goes right into $E$: $[E|R][R|R]^{-1}\\vec{r}$. \n", "\n", "Since $[R|R]\\big{(}[R|R]^{-1}\\vec{r}\\big{)} = \\vec{r}$, we can think of $[R|R]^{-1}\\vec{r}$ as the probability vector $\\vec{r}$ \"pulled back\" to before the $R$ box was performed, such that going into another $R$ box would recover the original probabilities. From this extended vantage point, \"what would have been $\\vec{r}$\" instead enters the $E$ box, characterized in terms of its relationship to $R$, giving us the probabilities for $E$'s lights to blink in terms of $R$'s, without assuming that the $R$ measurement was actually performed. We'll call this a *passive transformation*. We can think of it as a modification of the law of total probability.\n", "\n", "Read $\\vec{e}=[E|R][R|R]^{-1}\\vec{r}$ as \"what would have given probabilities $\\vec{r}$ on $R$ gives probabilities $\\vec{e}$ on $E$.\"\n", "\n", "### Active Transformations: In Between Boxes\n", "\n", "In between two boxes, it may happen that stuff changes. You can capture this change using pairs of identical reference boxes $R$, and thus calculate the result of an *active* *transformation* on probability vectors. The basic principle is that what can happen in between two identical boxes is dual to reference frame switching between two different boxes.\n", "\n", "Direct some stuff into a reference box $R$, let it transform, and afterwards send the stuff into $R$ again. Collect conditional probabilities. In a sense, we're making a new box, consisting of: whatever happens after the first $R$ box, including the second $R$ box. We'll call $R^{\\leftarrow}$. So we'll end up with $[R^{\\leftarrow}| R]$. 
\n", "\n", "If stuff transforms, and then goes into a box, you'll get the same effect as if the original stuff went into a box that's been transformed in the opposite direction (i.e. its outcome vectors have all been transformed in the opposite direction). For instance, hold out your thumb, pointed up. You can either a) turn it to the right, or b) turn your head to the left. With respect to the relationship between your head and your thumb, it comes to the same effect.\n", "\n", "Thus we can think of $R^{\\leftarrow}$ as $R$ transformed in the opposite way to the stuff. Thus: $\\vec{p}_{R}^{\\rightarrow} = [R^\\leftarrow |R][R|R]^{-1}\\vec{p}_{R}$. This gives us the probabilities for the stuff with respect to $R$ after the transformation, but before the second $R$ box. This is an *active transformation*. \n", "\n", "In other words, the way we should update our probability vector after a transformation, $\\vec{p}_{R}^\\rightarrow$, can be thought of as: \"what would have given probabilities $\\vec{r}$ on $R$ instead goes into $ R^{\\leftarrow}$, a box evolved in the opposite direction, and gives probabilities $\\vec{p}_{R}^\\rightarrow$\". In this sense, an active transformation is like \"would\" in reverse.\n", "\n", "#### Reversing transformations\n", "\n", "You may recall Bayes' Rule: $p(A|B) = \\frac{p(B|A)p(A)}{p(B)}$. \n", "\n", "Translated into matrix terms: $[A|B] = [A|B]^{T} \\circ |\\vec{c_{A}}\\rangle\\langle\\frac{1}{\\vec{c}_{B}}|$, where $\\vec{c}_{A}$ is the bias vector for $A$, and similarly $\\vec{c}_{B}$ is the bias vector for $B$. $|\\rangle\\langle|$ denotes the outer product, and $\\circ$ denotes the entry-wise matrix product. Naturally, this simplifies to $[B|A] = [A|B]^{T}$ in the case of unbiased boxes. \n", "\n", "****\n", "\n", "### Symmetrically Informationally Complete Boxes\n", "\n", "When it comes to reference measurements, the most beautiful kind are the most symmetrical: those with exactly $d^2$ pure elements such that the transition probabilities between them are all equal. They're called SIC's. \n", "\n", "For $d=2$, we'd have:\n", "\n", "$[R|R] = \\left[\\begin{matrix}\\frac{1}{2} & \\frac{1}{6} & \\frac{1}{6} & \\frac{1}{6}\\\\\\frac{1}{6} & \\frac{1}{2} & \\frac{1}{6} & \\frac{1}{6}\\\\\\frac{1}{6} & \\frac{1}{6} & \\frac{1}{2} & \\frac{1}{6}\\\\\\frac{1}{6} & \\frac{1}{6} & \\frac{1}{6} & \\frac{1}{2}\\end{matrix}\\right]$\n", "\n", "The columns are the output vectors of $R$. If you do $R$ twice, half the time you'll get the same answer, or else one of the other three answers, each a third of the time. 
\n", "\n", "We have:\n", "\n", "$[R|R]^{-1} = \\left[\\begin{matrix}\\frac{5}{2} & - \\frac{1}{2} & - \\frac{1}{2} & - \\frac{1}{2}\\\\- \\frac{1}{2} & \\frac{5}{2} & - \\frac{1}{2} & - \\frac{1}{2}\\\\- \\frac{1}{2} & - \\frac{1}{2} & \\frac{5}{2} & - \\frac{1}{2}\\\\- \\frac{1}{2} & - \\frac{1}{2} & - \\frac{1}{2} & \\frac{5}{2}\\end{matrix}\\right]$\n", "\n", "Recall how to invert a matrix: place the identity matrix to the right of your matrix: \n", "\n", "$\\left[\\begin{matrix}\\frac{1}{2} & \\frac{1}{6} & \\frac{1}{6} & \\frac{1}{6}\\\\\\frac{1}{6} & \\frac{1}{2} & \\frac{1}{6} & \\frac{1}{6}\\\\\\frac{1}{6} & \\frac{1}{6} & \\frac{1}{2} & \\frac{1}{6}\\\\\\frac{1}{6} & \\frac{1}{6} & \\frac{1}{6} & \\frac{1}{2}\\end{matrix}\\right] \\left[\\begin{matrix}1 & 0 & 0 & 0\\\\0 & 1 & 0 & 0\\\\0 & 0 & 1 & 0\\\\0 & 0 & 0 & 1\\end{matrix}\\right]$\n", "\n", "Then perform elementary row operations across the whole \"augmented\" matrix: swap rows, multiply or divide a row by a constant, add or subtract a multiple of a row to a row, until you get the identity matrix on the left: then $[R|R]^{-1} $ will be on the right.\n", "\n", "Notice that a SIC is unbiased:\n", "\n", "$\\left[\\begin{matrix}\\frac{1}{2} & \\frac{1}{6} & \\frac{1}{6} & \\frac{1}{6}\\\\\\frac{1}{6} & \\frac{1}{2} & \\frac{1}{6} & \\frac{1}{6}\\\\\\frac{1}{6} & \\frac{1}{6} & \\frac{1}{2} & \\frac{1}{6}\\\\\\frac{1}{6} & \\frac{1}{6} & \\frac{1}{6} & \\frac{1}{2}\\end{matrix}\\right]\\left[\\begin{matrix}\\frac{1}{4}\\\\\\frac{1}{4}\\\\\\frac{1}{4}\\\\\\frac{1}{4}\\end{matrix}\\right] = \\left[\\begin{matrix}\\frac{1}{4}\\\\\\frac{1}{4}\\\\\\frac{1}{4}\\\\\\frac{1}{4}\\end{matrix}\\right] $\n", "\n", "Thus we can get $G_{R}$ by multiplying the rows of $[R|R]$ by $2\\left[\\begin{matrix}\\frac{1}{4} & \\frac{1}{4} & \\frac{1}{4} & \\frac{1}{4}\\end{matrix}\\right]$. \n", "\n", "$G_{R} = \\left[\\begin{matrix}\\frac{1}{4} & \\frac{1}{12} & \\frac{1}{12} & \\frac{1}{12}\\\\\\frac{1}{12} & \\frac{1}{4} & \\frac{1}{12} & \\frac{1}{12}\\\\\\frac{1}{12} & \\frac{1}{12} & \\frac{1}{4} & \\frac{1}{12}\\\\\\frac{1}{12} & \\frac{1}{12} & \\frac{1}{12} & \\frac{1}{4}\\end{matrix}\\right]$ and $G_{R}^{-1} =\\left[\\begin{matrix}5 & -1 & -1 & -1\\\\-1 & 5 & -1 & -1\\\\-1 & -1 & 5 & -1\\\\-1 & -1 & -1 & 5\\end{matrix}\\right] $.\n", "\n", "Notice that for each of these four matrices, the off diagonal entries are equal. In general, for any $d$, the relevant matrices for a SIC are:\n", "\n", " $[R|R]_{i=j} = \\frac{1}{d},[R|R]_{i\\neq j} = \\frac{1}{d(d+1)} $ .\n", "\n", "$[R|R]^{-1}_{i=j} = \\frac{d(d+1)-1}{d}, [R|R]^{-1}_{i\\neq j} = -\\frac{1}{d} $\n", "\n", "$G_{{R}_{i=j}} = \\frac{1}{d^2}, G_{{R}_{i\\neq j}} = \\frac{1}{d^2(d+1)}$\n", "\n", "$G^{-1}_{{R}_{i=j}} = d(d+1)-1, G^{-1}_{{R}_{i\\neq j}} = -1$\n", "\n", "As a consequence, we can make some dramatic simplifications: \n", "\n", "$p_{a \\rightarrow b} = \\vec{a}G_{R}^{-1}\\vec{b} = [d(d+1)-1]\\sum_{i} a_{i}b_{i} - \\sum_{i\\neq j} a_{i}b_{j}$.\n", "\n", "In the case of $d=2$: \n", "\n", "$p_{a \\rightarrow b} = 5\\sum_{i} a_{i}b_{i} - \\sum_{i\\neq j} a_{i}b_{j}$.\n", "\n", "So we can say, when $p_{a \\rightarrow b} = 0$:\n", "\n", "$5\\sum_{i} a_{i}b_{i} = \\sum_{i\\neq j} a_{i}b_{j}$. \n", "\n", "In other words, the transition probability is $0$ if: when you send $a$-stuff and $b$-stuff into the reference measurement $R$, you're likely to get different answers five times more often than you get the same answer. 
\n", "\n", "Of course, when we're talking about the probabilities of the same or different lights blinking, we're talking about the probabilities irrespective of the ordering of the outcomes: in other words, the tensor product of the two probability vectors.\n", "\n", "#### The Tensor Product\n", "\n", "We could write $\vec{a} G_{R}^{-1} \vec{b} = (\vec{a} \otimes \vec{b}) \cdot \vec{G}_{R}^{-1}$, where $\otimes$ is the tensor product and $\vec{G}_{R}^{-1}$ is the vectorized version of the matrix $G_{R}^{-1}$ (all the rows of the matrix laid next to each other as a vector).\n", "\n", "For intuition about the tensor product: suppose each box had three possible outcomes, and you recorded this sequence of four runs of the two boxes.\n", "\n", "$$ \begin{matrix} A & B \\ \n", " 1 & 1 \\\n", " 1 & 2 \\\n", " 2 & 2 \\\n", " 3 & 3 \end{matrix} $$\n", " \n", "For $A$, we have $\frac{2}{4}$ of the time $1$; $\frac{1}{4}$ of the time $2$; and $\frac{1}{4}$ of the time $3$. For $B$, we have $\frac{1}{4}$ of the time $1$, $\frac{2}{4}$ of the time $2$, and $\frac{1}{4}$ of the time $3$.\n", "\n", "We can then consider $\vec{a} \otimes \vec{b}$: $\begin{pmatrix} \frac{2}{4} \\ \frac{1}{4} \\ \frac{1}{4} \end{pmatrix} \otimes \begin{pmatrix} \frac{1}{4} \\ \frac{2}{4} \\ \frac{1}{4} \end{pmatrix} = \begin{pmatrix} \frac{2}{16} \\ \frac{4}{16} \\ \frac{2}{16} \\ \frac{1}{16} \\ \frac{2}{16} \\ \frac{1}{16} \\ \frac{1}{16} \\ \frac{2}{16} \\ \frac{1}{16} \end{pmatrix}$.\n", "\n", "On the other hand, we can get the same result directly from the table of outcomes, by considering them in such a way that their ordering doesn't matter. \n", "\n", "Pair each actual outcome of $A$ with each actual outcome of $B$: you'll get a data set of 16 pairs: $11, 12, 12, 13; 11, 12, 12, 13; 21, 22, 22, 23; 31, 32, 32, 33$. Then, calculating the probabilities of each possible pair from this list, we get $\frac{2}{16}$ of the time 11, $\frac{4}{16}$ of the time $12$, $\dots$, just the same as in the tensor product $\vec{a} \otimes \vec{b}$ above.\n", "\n", "#### The Geometry of SIC's\n", "\n", "Returning to the transition probability for a SIC, because it must be between $0$ and $1$, we have:\n", "\n", "$0 \leq [d(d+1)-1]\sum_{i} a_{i}b_{i} - \sum_{i\neq j} a_{i}b_{j} \leq 1$\n", "\n", "$$ 0 \leq d(d+1)\sum_{i} a_{i}b_{i} - \sum_{i} a_{i}b_{i} - \sum_{i \neq j} a_{i}b_{j} \leq 1 $$\n", "\n", "$$ 0 \leq d(d+1)\sum_{i} a_{i}b_{i} - \sum_{i,j} a_{i}b_{j} \leq 1 $$\n", "\n", "And since $\sum_{i,j} a_{i}b_{j} = (\sum_{i} a_{i})(\sum_{j} b_{j}) = 1$:\n", "\n", "$$ 0 \leq d(d+1)\sum_{i} a_{i}b_{i} - 1 \leq 1 $$\n", "\n", "$$ \frac{1}{d(d+1)} \leq \sum_{i} a_{i}b_{i} \leq \frac{2}{d(d+1)} $$\n", "\n", "$$ \frac{1}{d(d+1)} \leq \vec{a} \cdot \vec{b} \leq \frac{2}{d(d+1)} $$\n", "\n", "In particular, $\vec{a} \cdot \vec{a} \leq \frac{2}{d(d+1)}$, so that we have $|\vec{a}| \leq \sqrt{\frac{2}{d(d+1)}}$. In other words, in addition to $\sum_{i} a_{i} = 1$, we have $\sum_{i} a_{i}^2 \leq \frac{2}{d(d+1)}$ constraining our SIC probability vectors. In the case of a pure vector, we have equality: the length of any pure probability vector is a constant fixed by the dimension. Therefore the pure vectors live on the surface of a *sphere* of that radius, and mixed vectors live within the sphere. 
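\n", "\n", "A quick check of these constraints for $d=2$ (assuming NumPy):\n", "\n", "```python\n", "import numpy as np\n", "\n", "a = np.array([1/2, 1/6, 1/6, 1/6])  # a SIC output vector (pure), d = 2\n", "m = np.ones(4)/4                    # the maximally mixed vector\n", "\n", "print(a @ a, 2/(2*3))  # 1/3 = 2/(d(d+1)): exactly on the sphere\n", "print(m @ m)           # 1/4 < 1/3: strictly inside it\n", "```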
\n", "\n", "Indeed, this sphere is a subset of the space of probabilities, the latter called the *probability simplex.* For $d=2$, this sphere is inscribed in the probability simplex, and just touches it. In higher dimensions, the sphere bubbles out of the edges of the simplex, so that not all of the sphere is in the probability simplex. The outcome vectors of a SIC themselves form a smaller simplex whose vertices are pure states, lying on the part of the sphere's surface contained within the probability simplex. One implication of all this is that a SIC probability vector won't have an entry that exceeds $\frac{1}{d}$, and its total number of zero entries can't be greater than $\frac{d(d-1)}{2}$. \n", "\n", "If $R$ is a SIC, we can also simplify our modified law of total probability, i.e., our rule for a passive reference frame switch, from $\vec{e} = [E|R][R|R]^{-1}\vec{r}$ to:\n", "\n", "$p(E_{i}) = \sum_{j} [(d+1)p(R_{j}) - \frac{1}{d}]p(E_{i}|R_{j})$.\n", "\n", "In the special case where $E$ is a reliable box, having mutually exclusive outcomes, this becomes $\vec{e} = (d+1)[E|R]\vec{r} - 1$, where the $1$ is subtracted from each entry. 
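\n", "\n", "A numerical comparison of the full rule and the simplified form (assuming NumPy; the $[E|R]$ numbers are the $[Z|R]$ matrix given below):\n", "\n", "```python\n", "import numpy as np\n", "\n", "d = 2\n", "E_given_R = np.array([[1, 1/3, 1/3, 1/3],\n", "                      [0, 2/3, 2/3, 2/3]])\n", "r = np.array([1/4, 5/12, 1/6, 1/6])  # some probabilities on R\n", "\n", "# Full rule, with [R|R]^{-1} = (d+1)I - J/d for a SIC.\n", "R_inv = (d+1)*np.eye(4) - np.ones((4, 4))/d\n", "print(E_given_R @ R_inv @ r)\n", "\n", "# Simplified form for a reliable E: (d+1)[E|R]r - 1.\n", "print((d+1)*(E_given_R @ r) - 1)  # same answer\n", "```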
\n", "\n", "### Informational Completeness with Reliable Boxes\n", "\n", "For a given dimension $d$, there are informationally complete boxes with at least $d^2$ lights. Such a single box can be used as a reference box to compute probabilities for what will happen if the stuff goes into any other box. But because it is informationally complete, it must be unreliable. On the other hand, one can achieve informational completeness using reliable boxes, but this requires sending stuff from the same source into multiple reliable boxes, whose probabilities all together will characterize the stuff.\n", "\n", "For example, in the case of $d=2$, consider three boxes we'll call $X, Y$, and $Z$. Each has two lights, which we'll denote $\uparrow$ and $\downarrow$. Their metrics are all the identity matrix: they are reliable boxes. We can characterize them with reference to a SIC in $d=2$, which has $4$ outcomes.\n", "\n", "For example:\n", "\n", "$ [Z|R] = \left[\begin{matrix}1 & \frac{1}{3} & \frac{1}{3} & \frac{1}{3}\\0 & \frac{2}{3} & \frac{2}{3} & \frac{2}{3}\end{matrix}\right]$ , $[R|Z]=\left[\begin{matrix}\frac{1}{2} & 0\\\frac{1}{6} & \frac{1}{3}\\\frac{1}{6} & \frac{1}{3}\\\frac{1}{6} & \frac{1}{3}\end{matrix}\right]$\n", "\n", "$ [Y|R] = \left[\begin{matrix}\frac{1}{2} & \frac{1}{2} & \frac{1}{\sqrt{6}} + \frac{1}{2} & \frac{1}{2} - \frac{1}{\sqrt{6}}\\\frac{1}{2} & \frac{1}{2} & \frac{1}{2} - \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} + \frac{1}{2}\end{matrix}\right]$ , $[R|Y] = \left[\begin{matrix}\frac{1}{4} & \frac{1}{4}\\\frac{1}{4} & \frac{1}{4}\\\frac{\sqrt{6}}{12} + \frac{1}{4} & \frac{1}{4} - \frac{\sqrt{6}}{12}\\\frac{1}{4} - \frac{\sqrt{6}}{12} & \frac{\sqrt{6}}{12} + \frac{1}{4}\end{matrix}\right]$\n", "\n", "$[X|R]=\left[\begin{matrix}\frac{1}{2} & \frac{\sqrt{2}}{3} + \frac{1}{2} & \frac{1}{2} - \frac{\sqrt{2}}{6} & \frac{1}{2} - \frac{\sqrt{2}}{6}\\\frac{1}{2} & \frac{1}{2} - \frac{\sqrt{2}}{3} & \frac{\sqrt{2}}{6} + \frac{1}{2} & \frac{\sqrt{2}}{6} + \frac{1}{2}\end{matrix}\right]$, $[R|X] = \left[\begin{matrix}\frac{1}{4} & \frac{1}{4}\\\frac{\sqrt{2}}{6} + \frac{1}{4} & \frac{1}{4} - \frac{\sqrt{2}}{6}\\\frac{1}{4} - \frac{\sqrt{2}}{12} & \frac{\sqrt{2}}{12} + \frac{1}{4}\\\frac{1}{4} - \frac{\sqrt{2}}{12} & \frac{\sqrt{2}}{12} + \frac{1}{4}\end{matrix}\right]$\n", "\n", "So: the columns of $[Z|R]$ tell us the probabilities that a $Z$ box will give $\uparrow$ or $\downarrow$ for each of the SIC outcome vectors. Given some probability vector with respect to $R$, we can calculate the probabilities for $Z$ outcomes via $[Z|R][R|R]^{-1}\vec{r}$. \n", "\n", "In contrast, the columns of $[R|Z]$ characterize the stuff that comes out of the $Z$ box with reference to $R$: outcome vectors $Z_{\uparrow}$ and $Z_{\downarrow}$. (In this case, things are aligned so that one of the outcome vectors of $Z$, in fact, corresponds to one of the SIC outcome vectors.) \n", "\n", "We can calculate the probabilities for $R$'s lights to blink given a vector of $Z$ probabilities with:\n", "\n", "$[R|Z][Z|Z]^{-1}\vec{z} = [R|Z]\vec{z}$\n", "\n", "Since the box is reliable, $[Z|Z] = I$. Thus we can use the unmodified law of total probability to calculate $R$ probabilities from $Z$ probabilities: no woulds necessary.\n", "\n", "#### Mutually Unbiased Reliable Boxes\n", "\n", "$X$, $Y$, and $Z$ have a special property. If we look at the conditional probability matrices $[X|Y], [X|Z], [Y|Z]$, and so on, we'll find: \n", "\n", "$[X|Y] = [X|R][R|R]^{-1}[R|Y] = \left[\begin{matrix}\frac{1}{2} & \frac{1}{2}\\\frac{1}{2} & \frac{1}{2}\end{matrix}\right]$.\n", "\n", "In other words, if something comes out of an $X$ box, and you send it through another $X$ box, it'll always give the same result. But if you instead send the stuff through an $X$ box and then a $Y$ box, you'll get $Y_{\uparrow}$ or $Y_\downarrow$ each with equal probability. The same holds if you send the stuff through a $Z$ box and then a $Y$ box, or any other pair. Thus, these three reliable boxes, each associated with two mutually exclusive outcomes, are also *unbiased* or *complementary* with respect to each other. 
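\n", "\n", "We can verify this numerically from the matrices above (assuming NumPy):\n", "\n", "```python\n", "import numpy as np\n", "\n", "s2, s6 = np.sqrt(2), np.sqrt(6)\n", "X_given_R = np.array([[1/2, 1/2 + s2/3, 1/2 - s2/6, 1/2 - s2/6],\n", "                      [1/2, 1/2 - s2/3, 1/2 + s2/6, 1/2 + s2/6]])\n", "R_given_Y = np.array([[1/4,         1/4        ],\n", "                      [1/4,         1/4        ],\n", "                      [1/4 + s6/12, 1/4 - s6/12],\n", "                      [1/4 - s6/12, 1/4 + s6/12]])\n", "\n", "# [R|R]^{-1} for the d=2 SIC: (d+1)I - J/d.\n", "R_inv = 3*np.eye(4) - np.ones((4, 4))/2\n", "\n", "print(X_given_R @ R_inv @ R_given_Y)  # [[0.5, 0.5], [0.5, 0.5]]\n", "```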
\n", "\n", "#### The Expected Value\n", "\n", "Suppose we assigned a valuation, a number to each outcome of our three boxes $X, Y, Z$: $\\{\\uparrow, \\downarrow\\} \\rightarrow \\{1,-1\\}$, so that each set of lights is associated with a \"weight vector\" $\\vec{\\lambda} = \\left[ \\begin{matrix}1 \\\\ -1 \\end{matrix}\\right] $. We can then assign an expected value for $X$ in terms of our reference probabilities.\n", "\n", "$\\langle X \\rangle = 1\\cdot p(X_\\uparrow) + (-1)\\cdot p(X_\\downarrow) = \\vec{\\lambda}[X|R][R|R]^{-1}\\vec{r}$\n", "\n", "In turns out that stuff with $d=2$ can also be fully characterized by the vector of expectation values $( \\langle X \\rangle,\\langle Y \\rangle,\\langle Z \\rangle )$. In fact, just as in the SIC representation, pure states live on the surface of a sphere, now in three dimensions, and mixed states live in the interior of the sphere. Each box $X, Y,$ or $Z$, has two outcome vectors, and in each case, the two outcome vectors correspond to antipodal points on the sphere, fixing three orthogonal axes. \n", "\n", "Moreover, if you calculate the $(\\langle X\\rangle, \\langle Y \\rangle, \\langle Z \\rangle)$ vectors corresponding to our SIC outcome vectors, you'll find that they make a tetrahedron that fits snugly in the sphere. It's worth mentioning, then, that SIC box for a given dimension is not unique: in this case, there's a whole sphere's worth of them, since rotating the tetrahedron as a whole won't change the angles between the vectors. Thus if you find two SIC's in $d=2$ such that they give different probabilities for the same stuff, then they must be related by a 3D rotation.\n", "\n", "Geometrically, if you have two kinds of stuff to which you've assigned 3-vectors $\\vec{a}_{xyz}$ and $\\vec{b}_{xyz}$, you can calculate the transition probability between them via $p_{a \\rightarrow b} = \\frac{1}{2}(1 + \\vec{a}_{xyz} \\cdot \\vec{b}_{xyz})$. This means that if two points are antipodal on the sphere, they have 0 probability of transitioning from one to another. \n", "\n", "Finally, the $X, Y, Z$ coordinates can be given in terms of our SIC probabilities: \n", "\n", "$\\left[\\langle X\\rangle, \\langle Y \\rangle, \\langle Z \\rangle \\right]=\\left[ \\left[\\begin{matrix}\\sqrt{2} \\left(2 p_{2} - p_{3} - p_{4}\\right)\\end{matrix}\\right], \\ \\left[\\begin{matrix}\\sqrt{6} \\left(p_{3} - p_{4}\\right)\\end{matrix}\\right], \\ \\left[\\begin{matrix}3 p_{1} - p_{2} - p_{3} - p_{4}\\end{matrix}\\right]\\right]$\n", "\n", "#### Time Reversal (for SIC's)\n", "\n", "Suppose I have a second SIC rotated around the $X$ axis 90 degrees from the first, so that $[0,0,1]$ goes to $[0,1,0]$. 
\n", "\n", "#### Time Reversal (for SIC's)\n", "\n", "Suppose I have a second SIC rotated 90 degrees around the $X$ axis from the first, so that $[0,0,1]$ goes to $[0,1,0]$. If I collect the conditional probabilities for outcomes of the rotated SIC given outcomes of the original SIC, I'd end up with:\n", "\n", "$[R^\prime|R] = \left[\begin{matrix}\frac{1}{4} & \frac{1}{4} & \frac{\sqrt{6}}{12} + \frac{1}{4} & \frac{1}{4} - \frac{\sqrt{6}}{12}\\\frac{1}{4} & \frac{17}{36} & \frac{5}{36} - \frac{\sqrt{6}}{36} & \frac{\sqrt{6}}{36} + \frac{5}{36}\\\frac{1}{4} - \frac{\sqrt{6}}{12} & \frac{\sqrt{6}}{36} + \frac{5}{36} & \frac{11}{36} & \frac{\sqrt{6}}{18} + \frac{11}{36}\\\frac{\sqrt{6}}{12} + \frac{1}{4} & \frac{5}{36} - \frac{\sqrt{6}}{36} & \frac{11}{36} - \frac{\sqrt{6}}{18} & \frac{11}{36}\end{matrix}\right]$\n", "\n", "It's worth observing that this, like many of the matrices we've dealt with, is a stochastic matrix, in fact a doubly stochastic matrix: its rows and columns all sum to 1. (A *right* stochastic matrix has rows which sum to 1, and a *left* stochastic matrix has columns that sum to 1). The point of stochastic matrices is that they preserve the fact that probabilities must sum to 1.\n", "\n", "As we've noted, we can calculate the probabilities with respect to the rotated SIC via: $\vec{r}^\prime = [R^\prime|R][R|R]^{-1}\vec{r}$. \n", "\n", "We can also consider the opposite rotation, its time reverse: e.g. a rotation from $[0,1,0]$ to $[0,0,1]$, by considering $[R|R^\prime]$. You'll get:\n", "\n", "$[R|R^\prime] = \left[\begin{matrix}\frac{1}{4} & \frac{1}{4} & \frac{1}{4} - \frac{\sqrt{6}}{12} & \frac{\sqrt{6}}{12} + \frac{1}{4}\\\frac{1}{4} & \frac{17}{36} & \frac{\sqrt{6}}{36} + \frac{5}{36} & \frac{5}{36} - \frac{\sqrt{6}}{36}\\\frac{\sqrt{6}}{12} + \frac{1}{4} & \frac{5}{36} - \frac{\sqrt{6}}{36} & \frac{11}{36} & \frac{11}{36} - \frac{\sqrt{6}}{18}\\\frac{1}{4} - \frac{\sqrt{6}}{12} & \frac{\sqrt{6}}{36} + \frac{5}{36} & \frac{\sqrt{6}}{18} + \frac{11}{36} & \frac{11}{36}\end{matrix}\right]$\n", "\n", "Notice that $[R|R^\prime] = [R^\prime|R]^{T}$. As we know, this will be true whenever you're using an unbiased reference box.\n", "\n", "Interestingly, in general, even for unbiased boxes, although $[R|R] = [R^\prime | R^\prime ]$:\n", "\n", "$[R|R^\prime] [R^\prime|R^\prime]^{-1} \neq \left( [R^\prime|R][R|R]^{-1} \right)^{T}$.\n", "\n", "But the equality *does* hold for a SIC.
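\n", "\n", "Finally, here's a numerical check of these last few claims (assuming NumPy):\n", "\n", "```python\n", "import numpy as np\n", "\n", "s6 = np.sqrt(6)\n", "Rp_given_R = np.array(\n", "    [[1/4,          1/4,          1/4 + s6/12,   1/4 - s6/12],\n", "     [1/4,          17/36,        5/36 - s6/36,  5/36 + s6/36],\n", "     [1/4 - s6/12,  5/36 + s6/36, 11/36,         11/36 + s6/18],\n", "     [1/4 + s6/12,  5/36 - s6/36, 11/36 - s6/18, 11/36]])\n", "R_given_Rp = Rp_given_R.T  # unbiased boxes: [R|R'] = [R'|R]^T\n", "\n", "print(Rp_given_R.sum(axis=0), Rp_given_R.sum(axis=1))  # doubly stochastic\n", "\n", "R_inv = 3*np.eye(4) - np.ones((4, 4))/2  # [R|R]^{-1} = [R'|R']^{-1}\n", "forward = Rp_given_R @ R_inv\n", "backward = R_given_Rp @ R_inv\n", "print(np.allclose(backward, forward.T))  # True: the SIC equality holds\n", "```\n", "\n", "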