{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Python and numpy bool Types"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This blog post is triggered by a colleague stopping me in the hall and asking \"What does `~` do in Python?\" She was surprised by the behavior of the `~` operator when applied to Python bool types and I was surprised that it behaved differently on numpy bools than on Python bools. All in all enough surprises to write a short blog post about the difference between the two variable types."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The `~` Operator\n",
    "\n",
    "Let's start with the original question, what does `~n` do in Python? Answer: [It inverts the bits of `n`](https://docs.python.org/3/library/stdtypes.html#bitwise-operations-on-integer-types), where `n` is an integer. So for example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " 85 in binary: +1010101\n",
      "~85 in binary: -1010110 is -86 in integer\n"
     ]
    }
   ],
   "source": [
    "n = 85\n",
    "print(\" {0:d} in binary: {0:+b}\".format(n))\n",
    "print(\"~{0:d} in binary: {1:+b} is {1:d} in integer\".format(n, ~n))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You may find it surprising that `~85` does not return the bit pattern `0101010` but this is just due to the [two's complement](https://wiki.python.org/moin/BitwiseOperators) representation of integers in Python. \n",
    "\n",
    "## Python bool\n",
    "\n",
    "Understanding two's complement and knowing that Python `bool`s are a subclass of `int`, it is not surprising that"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " True in binary:  0b1\n",
      "~True in binary: -0b10 is -2 in integer\n"
     ]
    }
   ],
   "source": [
    "print(\" True in binary:  {:s}\".format(bin(True)))\n",
    "print(\"~True in binary: {:s} is {:d} in integer\".format(bin(~True), ~True))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "and so the truth value of `~True` is `True`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "True\n"
     ]
    }
   ],
   "source": [
    "print(bool(~True))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "because any integer other than zero evaluates to `True`. This may come as a surprise if you are not aware that bools in Python are in fact integers, which use two's complement. It's even a little bit more confusing because"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-1 True\n"
     ]
    }
   ],
   "source": [
    "print(~False, bool(~False))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`~False` in fact evaluates to True. If you want to correctly negate Python boolean values use logical `not` and not bitwise not (`~`):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "False True\n"
     ]
    }
   ],
   "source": [
    "print(not True, not False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Numpy bool\n",
    "\n",
    "What surprised *me* was that numpy bools show a different behavior:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ True  True  True  True  True  True  True  True  True  True]\n",
      "[False False False False False False False False False False]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "a = np.ones(10, dtype=bool)\n",
    "print(a)\n",
    "print(~a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The reason for this is that numpy bools are an entirely different type. They are not an subclass of Python bools and they are also not a subclass of any numeric type. This is all clearly stated in the [numpy reference manual](https://docs.scipy.org/doc/numpy/reference/arrays.scalars.html#built-in-scalar-types) even with the following warning\n",
    "\n",
    "> The bool\\_ type is not a subclass of the int\\_ type (the bool\\_ is not even a number type). This is different than Python’s default implementation of bool as a sub-class of int.\n",
    "\n",
    "yet reading this without this example I didn't fully understand the consequences.\n",
    "\n",
    "In numpy we can make things even a little more convoluted if we mix Python bools and numpy.bool\\_ in an object array."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ True  True False  True]\n",
      "[ True  True  True False]\n"
     ]
    }
   ],
   "source": [
    "b = np.array([True, True, False, np.True_], dtype=object)\n",
    "print(b.astype(np.bool))\n",
    "print((~b).astype(np.bool))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "My advise above to use logical `not` also does not work for numpy arrays because `not` is not applied element-wise but tries to evaluate the boolean value of the entire array."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-8-c236fbab03dc>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;32mnot\u001b[0m \u001b[0mb\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"
     ]
    }
   ],
   "source": [
    "print(not b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The correct thing to do for numpy arrays is to use the ufunc `logical_not`, which gives the expected result for both our arrays `a` and `b`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Array a\n",
      "[ True  True  True  True  True  True  True  True  True  True]\n",
      "[False False False False False False False False False False]\n",
      "\n",
      "Array b\n",
      "[True True False True]\n",
      "[False False True False]\n"
     ]
    }
   ],
   "source": [
    "print(\"Array a\")\n",
    "print(a)\n",
    "print(np.logical_not(a))\n",
    "print(\"\\nArray b\")\n",
    "print(b)\n",
    "print(np.logical_not(b))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<small>\n",
    "<a rel=\"license\" href=\"http://creativecommons.org/licenses/by-sa/4.0/\">This blog post is licensed under a <a rel=\"license\" href=\"http://creativecommons.org/licenses/by-sa/4.0/\">Creative Commons Attribution-ShareAlike 4.0 International License</a>. This post was written as a <a href=\"http://www.jupyter.org/\">Jupyter Notebook</a> in Python. You can <a href=\"https://raw.githubusercontent.com/joergdietrich/joergdietrich.github.io/master/notebooks/BoolTypes.ipynb\">download</a> the original notebook.</small>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}