{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "d73d685e-8834-44ce-a75f-9ccbcefba1cf",
   "metadata": {},
   "source": [
    "# NumPy Arrays\n",
    "\n",
    "## Goals\n",
    "\n",
    "* For beginners, get a sense of how an array can be used.\n",
    "* For more experienced practitioners, fill in a deeper understanding of how arrays work and perhaps see one or two useful new things."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "d7f6f7e1-089e-42aa-b5ae-0e0e0ddc9c85",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "ad0ad1ba-04d4-44ee-9ff2-c42438bc5c8e",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  1,  2,  3,  4],\n",
       "       [ 5,  6,  7,  8,  9],\n",
       "       [10, 11, 12, 13, 14]])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.arange(15).reshape(3, 5)\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5c58464b-f8fe-4e5a-a19c-ccddec2fb5ff",
   "metadata": {},
   "source": [
    "## Items and slices"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "id": "a1f97b6d-94ca-4722-9925-8092edcf6432",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "6"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[1, 1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "8f18a4c0-9e9f-4f9d-b298-15cfade5477d",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 1, 2, 3, 4])"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "a774c866-6192-45c8-b830-32f81f823a23",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  5, 10])"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[:, 0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "id": "a8c5368e-57f8-4dd6-b02f-1977a1f1a971",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 1],\n",
       "       [5, 6]])"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[0:2, 0:2]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "904bbc00-6a99-4fa6-ad3d-d1e8c88725b2",
   "metadata": {},
   "source": [
    "What does this do?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "id": "8f63d3c3-5ecb-4f26-87a7-2e70237383a6",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([], shape=(0, 5), dtype=int64)"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[10:1000]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7ce57f12-7963-4c2e-96e1-bdfecf624559",
   "metadata": {},
   "source": [
    "## Arrays with different dimensions can be combined via \"broadcasting\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "b0a504f6-1da4-4b35-a38c-2695bc4d5686",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[   0,  100,  200,  300,  400],\n",
       "       [ 500,  600,  700,  800,  900],\n",
       "       [1000, 1100, 1200, 1300, 1400]])"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a * 100"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "66550b98-57ae-43ca-ac57-7f1dd260e83e",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  0,  0,  0,  0],\n",
       "       [ 5,  5,  5,  5,  5],\n",
       "       [10, 10, 10, 10, 10]])"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a - a[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "432aabd8-e00e-47a0-9c27-5944b0b08a92",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "ename": "ValueError",
     "evalue": "operands could not be broadcast together with shapes (3,5) (3,) ",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[55], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43ma\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m-\u001b[39;49m\u001b[43m \u001b[49m\u001b[43ma\u001b[49m\u001b[43m[\u001b[49m\u001b[43m:\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m  \u001b[38;5;66;03m# nope!\u001b[39;00m\n",
      "\u001b[0;31mValueError\u001b[0m: operands could not be broadcast together with shapes (3,5) (3,) "
     ]
    }
   ],
   "source": [
    "a - a[:, 0]  # nope!"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a3284862-fceb-4e13-b070-ad79469210f6",
   "metadata": {
    "tags": []
   },
   "source": [
    "Quoting https://jakevdp.github.io/PythonDataScienceHandbook/02.05-computation-on-arrays-broadcasting.html\n",
    "\n",
    "> Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.\n",
    "\n",
    "> Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.\n",
    "\n",
    "> Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "id": "1820c966-3b91-493c-8678-a30c689ab963",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(3, 5)\n",
      "(3,)\n"
     ]
    }
   ],
   "source": [
    "print(a.shape)\n",
    "print(a[:, 0].shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "id": "e14c3e09-25bf-433d-8652-3cda7f8e8244",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  5, 10])"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[:, 0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "id": "9329ff6a-4def-4f39-a0b5-49b153d7ea96",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0],\n",
       "       [ 5],\n",
       "       [10]])"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[:, 0, np.newaxis]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "id": "f13ab434-b175-4c44-b42f-ed7ec764763e",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(3, 5)\n",
      "(3, 1)\n"
     ]
    }
   ],
   "source": [
    "print(a.shape)\n",
    "print(a[:, 0, np.newaxis].shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "8400fcbb-5f7f-4200-b982-d9adf2982a9f",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 1, 2, 3, 4],\n",
       "       [0, 1, 2, 3, 4],\n",
       "       [0, 1, 2, 3, 4]])"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a - a[:, 0, np.newaxis]"
   ]
  },
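  {
   "cell_type": "markdown",
   "id": "3f2a9c1e-5b7d-4e8a-9c0f-1a2b3c4d5e6f",
   "metadata": {},
   "source": [
    "The rules above can also be checked on shapes alone, without building any arrays. This is a minimal sketch, assuming NumPy 1.20 or later (where `np.broadcast_shapes` is available); the example shapes mirror the ones used above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7b1d4e2a-9c3f-4a6b-8d5e-2f0a1b3c4d5e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Rule 1 pads (5,) to (1, 5); Rule 2 stretches the length-1 axis to 3.\n",
    "print(np.broadcast_shapes((3, 5), (5,)))\n",
    "\n",
    "# Rule 2 stretches the length-1 axis of (3, 1) to 5.\n",
    "print(np.broadcast_shapes((3, 5), (3, 1)))\n",
    "\n",
    "# Rule 1 pads (3,) to (1, 3); the last axis is 5 vs. 3, so Rule 3 raises.\n",
    "try:\n",
    "    np.broadcast_shapes((3, 5), (3,))\n",
    "except ValueError as err:\n",
    "    print(f\"ValueError: {err}\")"
   ]
  },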
  {
   "cell_type": "markdown",
   "id": "d008856f-5848-4ae0-9a87-a590ea033818",
   "metadata": {},
   "source": [
    "Slices can be created on their own and reused."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "id": "f4c71fc7-52f6-40cc-b130-fe3ef59fe308",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "b = array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])\n",
      "every2 = slice(None, None, 2)\n",
      "every10 = slice(None, None, 10)\n",
      "b[every2] = array([ 0,  2,  4,  6,  8, 10, 12, 14])\n",
      "b[every10] = array([ 0, 10])\n"
     ]
    }
   ],
   "source": [
    "every2 = np.s_[::2]\n",
    "every10 = np.s_[::10]\n",
    "b = np.arange(15)\n",
    "print(f\"{b = }\")\n",
    "print(f\"{every2 = }\")\n",
    "print(f\"{every10 = }\")\n",
    "print(f\"{b[every2] = }\")\n",
    "print(f\"{b[every10] = }\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01dc4801-b160-4978-9c97-d95da729e50d",
   "metadata": {},
   "source": [
    "Great reference on slices in Python in general and multi-dimensional slicing in particular: https://quansight-labs.github.io/ndindex/slices.html"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0be401df-b534-4431-8252-6a748eb44912",
   "metadata": {},
   "source": [
    "## Anatomy of an Array"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "4440f9e6-6615-4fd1-a7c0-e05d55e79ed5",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3, 5)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "9602c680-1557-4299-aeb3-550a24d8eb8a",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.ndim"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "add65ea4-b5a3-4304-bc55-ff495cffe25b",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "15"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.size"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "f72533b0-49c2-48d8-ae3e-cd0a114737f4",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "120"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.nbytes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "e52af5d9-08ea-434d-a58c-8b9a7f4f9ce7",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('int64')"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "82ce2731-56df-494e-832a-a363ee6250ff",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14]]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.tolist()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c69dba7a-00a6-40bc-9fb0-c89f0c8ef79f",
   "metadata": {},
   "source": [
    "## Peeking under the hood, just for a moment\n",
    "\n",
    "An array is a block of memory plus rules for \"striding\" through it and interpreting it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "5de92b22-3c0d-4f73-bc72-1ebbbf9c95c3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<memory at 0x7f76ff7a9cb0>"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "id": "7d5a1e50-00af-4c9f-baa9-3006a926c13e",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "8"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.dtype.itemsize"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "id": "cbc97b8a-2355-4b5a-9f3e-e43ff81583b8",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3, 5)"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "e58db0bb-b79a-4b13-be85-57e60d29bb9e",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(40, 8)"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.strides"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "ab31f56f-b245-4863-9a64-128d686126dd",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0 1 2 3 4]\n",
      "[ 0  5 10]\n"
     ]
    }
   ],
   "source": [
    "print(a[0])\n",
    "print(a[:, 0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "1f282944-ba5a-4a5a-8d8a-8803c8b7e351",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<memory at 0x7f76ff7a9e50>"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "5a5f5b09-81b7-4b72-b4c0-c702fcc32e03",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'00000000000000000100000000000000020000000000000003000000000000000400000000000000050000000000000006000000000000000700000000000000080000000000000009000000000000000a000000000000000b000000000000000c000000000000000d000000000000000e00000000000000'"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.data.hex()"
   ]
  },
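  {
   "cell_type": "markdown",
   "id": "c4e8a1f2-6d3b-4b7a-a9e0-5f1c2d3e4a5b",
   "metadata": {},
   "source": [
    "As a rough illustration of how strides locate an item, the sketch below computes the byte offset of `a[i, j]` by hand and reads that item back out of a copy of the raw bytes. The indices `i` and `j` are arbitrary choices for the example, and the sketch assumes a C-contiguous array like `a`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9a2b3c4-1d5f-4c6e-b7a8-0d1e2f3a4b5c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "a = np.arange(15).reshape(3, 5)\n",
    "i, j = 1, 2  # arbitrary example indices\n",
    "\n",
    "# Step i rows and j columns, each by that axis's stride in bytes.\n",
    "offset = i * a.strides[0] + j * a.strides[1]\n",
    "\n",
    "# tobytes() copies the items out in C (row-major) order.\n",
    "raw = a.tobytes()\n",
    "\n",
    "# Interpret one item's worth of bytes at that offset using the array's dtype.\n",
    "item = np.frombuffer(raw, dtype=a.dtype, count=1, offset=offset)[0]\n",
    "\n",
    "print(f\"{offset = }\")\n",
    "print(f\"{item = }\")\n",
    "print(f\"{a[i, j] = }\")"
   ]
  },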
  {
   "cell_type": "markdown",
   "id": "f4ebcb0a-33f4-46f5-b533-bc7909807876",
   "metadata": {},
   "source": [
    "## Limitations and Coping Strategies\n",
    "\n",
    "* No way to label dimensions; you have to keep track of which is which.\n",
    "    * Pass around an object, like a dict, as a key that records which dimension is which.\n",
    "    * Consider using xarray.\n",
    "    * Resist the temptation to subclass! If you want to go down that general path, look at [Writing custom array containers](https://numpy.org/doc/stable/user/basics.dispatch.html).\n",
    "* No way to include coordinates (\"tick labels\")\n",
    "    * Pass around a simple object, like a dict, containing multiple numpy arrays.\n",
    "    * Consider using xarray.\n",
    "* No built-in support for units\n",
    "    * Use a library like pint.\n",
    "    * NumPy has added support for custom data types...\n",
    "        * https://numpy.org/neps/nep-0042-new-dtypes.html\n",
    "        * https://github.com/numpy/numpy-user-dtypes\n",
    "    * ...which can be used to implement units!\n",
    "        * https://github.com/seberg/unitdtype\n",
    "    * Unit support built on NumPy's custom data types is not \"mainstream\" yet, but it is growing."
   ]
  },
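  {
   "cell_type": "markdown",
   "id": "a1b2c3d4-7e8f-4a9b-8c0d-6e5f4a3b2c1d",
   "metadata": {},
   "source": [
    "One possible shape for the \"simple object\" approach is sketched below. The key names `data`, `dims`, and `coords` are hypothetical choices for illustration, not a NumPy convention (they loosely echo xarray's vocabulary), and the coordinate values are made up."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d2c3b4a5-8f9e-4b0a-9d1c-7f6e5d4c3b2a",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Hypothetical layout: the key names are illustrative only.\n",
    "labeled = {\n",
    "    \"data\": np.arange(15).reshape(3, 5),\n",
    "    \"dims\": (\"row\", \"col\"),  # one name per axis, in axis order\n",
    "    \"coords\": {\n",
    "        \"row\": np.array([10.0, 20.0, 30.0]),  # made-up \"tick labels\"\n",
    "        \"col\": np.arange(5) * 0.5,\n",
    "    },\n",
    "}\n",
    "\n",
    "# Look up the axis number by name instead of remembering it.\n",
    "axis = labeled[\"dims\"].index(\"col\")\n",
    "print(labeled[\"data\"].sum(axis=axis))"
   ]
  },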
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "980eaa70-9775-4038-a8c3-978d6048d6e0",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}