{ "cells": [ { "cell_type": "markdown", "id": "70c36625", "metadata": {}, "source": [ "--- \n", " \n", "\n", "

Department of Data Science

\n", "

Course: Tools and Techniques for Data Science

\n", "\n", "---\n", "

Instructor: Muhammad Arif Butt, Ph.D.

" ] }, { "cell_type": "markdown", "id": "adc36fed", "metadata": {}, "source": [ "

Lecture 3.4 (NumPy-04)

" ] }, { "cell_type": "markdown", "id": "a70da55c", "metadata": {}, "source": [ "\"Open" ] }, { "cell_type": "markdown", "id": "a7b29f74", "metadata": {}, "source": [ "# _Array Indexing, Subsetting and Slicing.ipynb_" ] }, { "cell_type": "markdown", "id": "10c1fecf", "metadata": {}, "source": [ " " ] }, { "cell_type": "code", "execution_count": null, "id": "349755c2", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "17adef0e", "metadata": {}, "source": [ "# Learning agenda of this notebook\n", "Indexing and slicing are two of the most common operations you need when working with NumPy arrays. You use them whenever you want to work with a subset of an array.\n", "1. Indexing NumPy Arrays\n", " - Indexing 1-D NumPy Arrays\n", " - Indexing 2-D NumPy Arrays\n", " - Indexing 3-D NumPy Arrays\n", "2. Slicing NumPy Arrays\n", " - Slicing 1-D NumPy Arrays\n", " - Slicing 2-D NumPy Arrays\n", "3. Boolean Array Indexing\n", " - Boolean Indexing on 1-D NumPy Arrays\n", " - Boolean Indexing on 2-D NumPy Arrays" ] }, { "cell_type": "code", "execution_count": null, "id": "b2f4f561", "metadata": {}, "outputs": [], "source": [ "# To install this library in Jupyter notebook\n", "#import sys\n", "#!{sys.executable} -m pip install numpy" ] }, { "cell_type": "code", "execution_count": null, "id": "dba905d0", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "np.__version__ , np.__path__" ] }, { "cell_type": "code", "execution_count": null, "id": "4825d414", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "0c917a71", "metadata": {}, "source": [ "## 1. Indexing NumPy Arrays\n", "- You can access an entire dimension or an individual element of a NumPy array using indexing.\n", "- Indexes in NumPy arrays start at 0: the first element has index 0, the second has index 1, and so on.\n", "- You can use negative indices as well, which count backward 
from the last element." ] }, { "cell_type": "markdown", "id": "ef47a8ba", "metadata": {}, "source": [ "### a. Indexing 1-D NumPy Arrays\n", "- Along a single axis, integers are used to select single elements, and so-called slices are used to select ranges and sequences of elements. \n", "- Positive integers are used to index elements from the beginning of the array (index starts at 0), and negative integers are used to index elements from the end of the array, where the last element is indexed with –1, the second to last element with –2, and so on." ] }, { "cell_type": "code", "execution_count": 1, "id": "911a526a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Original Array \n", " [4 5 6 7 0 2 3] \n", "Array Shape: (7,)\n", "\n", "arr[0] = 4\n", "arr[6] = 3\n", "arr[-1] = 3\n", "arr[-7] = 4\n" ] }, { "ename": "IndexError", "evalue": "index 7 is out of bounds for axis 0 with size 7", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/var/folders/1t/g3ylw8h50cjdqmk5d6jh1qmm0000gn/T/ipykernel_53717/2576678347.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 14\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 15\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"arr[7] = \"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0marr\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m7\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# IndexError\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 16\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"arr[-8] = \"\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0marr\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m8\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m 
\u001b[0;31m# IndexError\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mIndexError\u001b[0m: index 7 is out of bounds for axis 0 with size 7" ] } ], "source": [ "import numpy as np\n", "mylist = [4, 5, 6, 7, 0, 2, 3]\n", "arr = np.array(mylist, dtype=np.uint8)\n", "\n", "\n", "print(\"Original Array \\n\", arr, \"\\nArray Shape: \",arr.shape)\n", "\n", "# You can access specific elements using positive as well as negative index\n", "print(\"\\narr[0] = \", arr[0])\n", "print(\"arr[6] = \", arr[6])\n", "print(\"arr[-1] = \", arr[-1]) \n", "print(\"arr[-7] = \", arr[-7]) \n", "\n", "\n", "print(\"arr[7] = \", arr[7]) # IndexError\n", "print(\"arr[-8] = \", arr[-8]) # IndexError" ] }, { "cell_type": "code", "execution_count": null, "id": "6223df9f", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "a01d3c7c", "metadata": {}, "source": [ "### b. Indexing 2-D NumPy Arrays" ] }, { "cell_type": "code", "execution_count": 2, "id": "2bb93d8c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Original Array \n", " [[1 2 3 4]\n", " [5 6 7 8]\n", " [3 2 4 1]\n", " [7 3 4 9]\n", " [4 0 3 1]] \n", "Array Shape: (5, 4)\n", "Strides: (32, 8)\n", "arr[3][2] = 4\n", "arr[3,2] = 4\n", "arr[3] = [7 3 4 9]\n", "arr[-1][-2] = 3\n", "arr[-1, -2] = 3\n", "arr[-3][-3] = 2\n", "arr[-3, -3] = 2\n", "arr[-3] = [3 2 4 1]\n", "arr[-1] = [4 0 3 1]\n" ] } ], "source": [ "import numpy as np\n", "mylist = [\n", " [1, 2, 3, 4], \n", " [5, 6, 7, 8], \n", " [3, 2, 4, 1], \n", " [7, 3, 4, 9], \n", " [4, 0, 3, 1]\n", " ]\n", "arr = np.array(mylist)\n", "print(\"Original Array \\n\", arr, \"\\nArray Shape: \",arr.shape)\n", "print(\"Strides: \", arr.strides)\n", "\n", "# You can access specific elements\n", "print(\"arr[3][2] = \", arr[3][2])\n", "print(\"arr[3,2] = \", arr[3,2])\n", "\n", "# You can access entire rows\n", "print(\"arr[3] = \", arr[3])\n", "\n", "# Negative indexing\n", "print(\"arr[-1][-2] = 
\", arr[-1][-2]) \n", "print(\"arr[-1, -2] = \", arr[-1, -2]) \n", "print(\"arr[-3][-3] = \", arr[-3][-3]) \n", "print(\"arr[-3, -3] = \", arr[-3, -3]) \n", "print(\"arr[-3] = \", arr[-3]) \n", "print(\"arr[-1] = \", arr[-1]) \n" ] }, { "cell_type": "code", "execution_count": null, "id": "ebacb660", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "f961b6c9", "metadata": {}, "source": [ "### c. Indexing 3-D NumPy Arrays" ] }, { "cell_type": "code", "execution_count": null, "id": "4c9248b2", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "mylist = [\n", " [[1, 2, 3], [5, 6, 7], [9, 0, 5]],\n", " [[8, 1, 6], [1, 9, 4], [5, 8, 2]],\n", " ]\n", "arr = np.array(mylist)\n", "print(\"Original Array:\\n\", arr)\n", "print(\"Array Shape = \",arr.shape)\n", "print(\"Strides:\", arr.strides)\n" ] }, { "cell_type": "code", "execution_count": null, "id": "16b00adf", "metadata": {}, "outputs": [], "source": [ "# You can access 2-D matrix at first level\n", "print(\"\\narr[0]: \\n\", arr[0])\n", "# You can access 2-D matrix at second level\n", "print(\"arr[1]: \\n\", arr[1])" ] }, { "cell_type": "code", "execution_count": null, "id": "3e341bf9", "metadata": {}, "outputs": [], "source": [ "print(\"Original Array:\\n\", arr)\n", "print(\"Array Shape = \",arr.shape)\n", "\n", "# You can access a specific row at a specific level\n", "print(\"\\narr[0][1]: \", arr[0][1])\n", "print(\"\\narr[0, 1]: \", arr[0, 1])\n", "print(\"arr[1][2]: \", arr[1][2])\n", "print(\"arr[1, 2]: \", arr[1, 2])" ] }, { "cell_type": "code", "execution_count": null, "id": "44ecbd67", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "06f2cef4", "metadata": {}, "outputs": [], "source": [ "print(\"Original Array:\\n\", arr)\n", "print(\"Array Shape = \",arr.shape)\n", "\n", "# You can access a specific element\n", "print(\"\\narr[0][1][2]: \", arr[0][1][2])\n", "print(\"arr[1][2][1]: \", arr[1][2][1])\n" ] }, { 
"cell_type": "code", "execution_count": null, "id": "3b049664", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "7c6f172d", "metadata": {}, "outputs": [], "source": [ "print(\"Original Array:\\n\", arr)\n", "print(\"Array Shape = \",arr.shape)\n", "\n", "# Negative indexing\n", "print(\"\\narr[-1][-2][-1]: \", arr[-1][-2][-1]) \n", "print(\"arr[-2][2]: \", arr[-2][2]) " ] }, { "cell_type": "code", "execution_count": null, "id": "6c40fe29", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "f37d31f0", "metadata": {}, "source": [ "## 2. Slicing NumPy Arrays\n", "- You can slice a numpy array in a similar fashion as we have sliced Python Lists, with two differences:\n", " - The difference is that NumPy arrays can be sliced in more than one dimension.\n", " - The other difference is that, when we slice a Python list we get a completely new list, while in case of numPy arrays, you get a **view** of the original array, which is just a way of accessing array data. Thus the original array is not copied in memory.\n", "- An array can be sliced using `:` symbol, which returns the range of elements specified by the index numbers.\n", "- There are three arguments for slicing arrays, all are optional:\n", "```\n", "array[[start]:[stop][:step]]\n", "```\n", "\n", " - start: specifies from where the slicing should start, inclusive (default is 0) \n", " - stop: specifies where it has to stop, exclusive (default is end of the array) \n", " - step: is by-default 1\n", " \n", "Note: Subarrays that are extracted from arrays using slice operations are alternative views of the same underlying array data. This means that they are arrays that refer to the same data in memory as the original array, but with a different strides configuration." 
] }, { "cell_type": "code", "execution_count": null, "id": "e4a6cef9", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "adc5e316", "metadata": {}, "source": [ "### a. Slicing 1-D Arrays" ] }, { "cell_type": "code", "execution_count": null, "id": "8cab2f63", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "mylist = [4, 5, 6, 7, 0, 2, 3]\n", "arr = np.array(mylist, dtype=np.uint8)\n", "print(\"Original Array \\n\", arr, \"\\nArray Shape = \",arr.shape)" ] }, { "cell_type": "code", "execution_count": null, "id": "bdc02f73", "metadata": {}, "outputs": [], "source": [ "print(\"\\narr[:] = \", arr[:])\n", "print(\"arr[3:] = \", arr[3:])\n", "print(\"arr[:4] = \", arr[:4])\n", "print(\"arr[2:5] = \", arr[2:5])\n", "\n", "print(\"arr[:-2] = \", arr[:-2])\n", "print(\"arr[-1:] = \", arr[-1:])\n", "print(\"arr[-1:-4] = \", arr[-1:-4]) # empty: the default step is +1, so start must come before stop\n", "print(\"arr[-1:-4:-1] = \", arr[-1:-4:-1])\n", "\n", "# Reverse the array using a step value of -1\n", "print(\"arr[::-1] = \", arr[::-1])\n", "print(\"\\nAfter all this, arr is unchanged: \", arr)" ] }, { "cell_type": "markdown", "id": "d23838cb", "metadata": {}, "source": [ "**Proof of Concept:** Slice of a Python List Returns a New List" ] }, { "cell_type": "code", "execution_count": null, "id": "64cd9967", "metadata": {}, "outputs": [], "source": [ "list1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]\n", "list1" ] }, { "cell_type": "code", "execution_count": null, "id": "c3c4d4a8", "metadata": {}, "outputs": [], "source": [ "# A new list object is created after slicing\n", "list2 = list1[2:5]\n", "list2" ] }, { "cell_type": "code", "execution_count": null, "id": "4df7ddc8", "metadata": {}, "outputs": [], "source": [ "# If we change this new list, the original list is not affected\n", "list2[0] = 99" ] }, { "cell_type": "code", "execution_count": null, "id": "559d3ad8", "metadata": {}, "outputs": [], "source": [ "list1" ] }, { "cell_type": "code", "execution_count": null, "id": "48131b6b", "metadata": {}, 
"outputs": [], "source": [ "list2" ] }, { "cell_type": "code", "execution_count": null, "id": "42e33d84", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "cbcaecda", "metadata": {}, "source": [ "**Proof of Concept:** Slice of a NumPy array returns a **view** of the original NumPy array" ] }, { "cell_type": "code", "execution_count": null, "id": "26adaf89", "metadata": {}, "outputs": [], "source": [ "list1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]\n", "arr1 = np.array(list1)\n", "arr1" ] }, { "cell_type": "code", "execution_count": null, "id": "0810978f", "metadata": {}, "outputs": [], "source": [ "# Slicing creates a view of the original NumPy array\n", "arr2 = arr1[2:5]\n", "arr2" ] }, { "cell_type": "code", "execution_count": null, "id": "63fc96e4", "metadata": {}, "outputs": [], "source": [ "# Changing the view also changes the original array, because both share the same data\n", "arr2[0] = 99" ] }, { "cell_type": "code", "execution_count": null, "id": "572208e0", "metadata": {}, "outputs": [], "source": [ "arr1" ] }, { "cell_type": "code", "execution_count": null, "id": "060b5093", "metadata": {}, "outputs": [], "source": [ "arr2, arr2.strides" ] }, { "cell_type": "markdown", "id": "4c18349f", "metadata": {}, "source": [ "### b. Slicing 2-D Arrays\n", "- Slicing a two-dimensional array is very similar to slicing a one-dimensional array. You just use a comma to separate the row slice and the column slice.\n", "- NumPy extends Python's list indexing notation using `[]` to multiple dimensions in an intuitive fashion. You can provide a comma-separated list of indices or ranges to select a specific element or a subarray (also called a slice) from a NumPy array." 
] }, { "cell_type": "code", "execution_count": null, "id": "f84dbff5", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "mylist = [\n", " [1, 2, 3, 4], \n", " [5, 6, 7, 8], \n", " [3, 2, 4, 1], \n", " [7, 3, 4, 9], \n", " [4, 0, 3, 1]\n", " ]\n", "arr = np.array(mylist)\n", "print(\"Original Array \\n\", arr, \"\\nArray Size = \",arr.shape)\n", "\n", "# Note we have two slice objects (row slice and column slice) in case of 2-D slicing separated by a comma\n", "print(\"\\narr[:,:] = \\n\", arr[:,:])" ] }, { "cell_type": "code", "execution_count": null, "id": "ae493326", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "c1796549", "metadata": {}, "outputs": [], "source": [ "print(\"Original Array \\n\", arr, \"\\nArray Size = \",arr.shape)\n", "\n", "# Get the row at index 2\n", "print(\"arr[2] = \", arr[2]) # you can ignore the : symbol for the column part\n", "print(\"arr[2,:]= \", arr[2,:]) # for better readability give comma separated two values" ] }, { "cell_type": "code", "execution_count": null, "id": "3c6a2f5d", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "3a2b2662", "metadata": {}, "outputs": [], "source": [ "print(\"Original Array \\n\", arr, \"\\nArray Size = \",arr.shape)\n", "\n", "# Get the column at index 1 (Get all the row values of column at index 1)\n", "print(\"arr[:, 1] = \", arr[:, 1])" ] }, { "cell_type": "code", "execution_count": null, "id": "b5bc82b0", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "19e6c4a7", "metadata": {}, "outputs": [], "source": [ "print(\"Original Array \\n\", arr, \"\\nArray Size = \",arr.shape)" ] }, { "cell_type": "code", "execution_count": null, "id": "8600e00e", "metadata": {}, "outputs": [], "source": [ "# From the row at index 1, slice elements from index 1 to index 2\n", "print(\"\\narr[1, 1:3] = \",arr[1, 1:3])" ] }, { "cell_type": 
"code", "execution_count": null, "id": "e59011e4", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "6160e921", "metadata": {}, "outputs": [], "source": [ "print(\"Original Array \\n\", arr, \"\\nArray Size = \",arr.shape)\n", "\n", "# Reversing elements of all rows is tricky. \n", "# Read it as \"select all the rows, and in each row reverse all the column values\n", "print(\"\\narr[:, ::-1] =\\n\", arr[:, ::-1])" ] }, { "cell_type": "code", "execution_count": null, "id": "0fc157b7", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "2f336be6", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "da8c9ce5", "metadata": {}, "outputs": [], "source": [ "print(\"Original Array \\n\", arr, \"\\nArray Size = \",arr.shape)\n", "\n", "# Reverse elements of all columns is also tricky. Actually you want to reverse the elements of all the rows\n", "# Read it as \"select all the columns, and in each column reverse all the row values\n", "print(\"\\narr[::-1, :] =\\n\", arr[::-1, :])" ] }, { "cell_type": "code", "execution_count": null, "id": "af020e1b", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "6ae17d51", "metadata": {}, "source": [ ">- **Slicing 3-D arrays can be performed in the same fashion. 
The only difference is that we have three comma separated slice objects instead of two and it is a bit tricky to visualize :)**" ] }, { "cell_type": "code", "execution_count": null, "id": "dd91a371", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 17, "id": "4a85264a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4\n", "[7 3 4 9]\n", "3\n", "2\n", "[3 2 4 1]\n", "[2 4]\n", "[[4 0 3 1]\n", " [7 3 4 9]\n", " [3 2 4 1]\n", " [5 6 7 8]\n", " [1 2 3 4]]\n", "[[3 4]\n", " [4 3]]\n" ] } ], "source": [ "arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [3, 2, 4, 1], [7, 3, 4, 9], [4, 0, 3, 1] ])\n", "print(arr[3,2])\n", "print(arr[3])\n", "print(arr[-1, -2]) \n", "print(arr[-3][-3]) \n", "print(arr[-3]) \n", "print(arr[2, 1:3])\n", "print(arr[::-1, :])\n", "print(arr[2::2, 0::2])" ] }, { "cell_type": "code", "execution_count": null, "id": "7316bc7e", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "c723ba59", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "016d8261", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "8d61858c", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "763ff575", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "77f1c6bb", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "8bdfb1cb", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "2b77777e", "metadata": {}, "source": [ "## 3. Boolean/Fancy Array Indexing\n", "\n", "- NumPy provides another convenient method to index arrays, called fancy indexing. 
With fancy indexing, an array can be indexed with another NumPy array, a Python list, or a sequence of integers, whose values select elements in the indexed array. Fancy indexing requires that the elements in the array or list used for indexing are integers.\n", "\n", "- Boolean indexing is an extremely intuitive and elegant way of selecting contents from a NumPy array based on logical conditions.\n", "- In simple words, we can filter NumPy arrays by either:\n", " - Providing a condition inside the `[]` operator\n", " - Providing a Boolean mask corresponding to indexes in the array.\n", "- If the value at an index of the mask is True, that element is included in the filtered array; if the value at that index is False, that element is excluded from the filtered array.\n", "\n", ">- **Note:** Unlike normal slicing, which creates a view, Boolean/Fancy Indexing creates a copy of the NumPy array.\n", "\n", ">- **Note:** The Python keywords `and` and `or` do not work with boolean arrays. Use `&` and `|` instead, and wrap each comparison in parentheses, because `&` and `|` bind more tightly than comparison operators." ] }, { "cell_type": "markdown", "id": "219e725e", "metadata": {}, "source": [ "### a. 
Boolean/Fancy Indexing on 1-D Arrays" ] }, { "cell_type": "code", "execution_count": null, "id": "ad6d115f", "metadata": {}, "outputs": [], "source": [ "# Fancy Indexing\n", "import numpy as np\n", "# Create a 1-D array of 10 random integers from 1 to 100 (both inclusive)\n", "arr = np.random.randint(1, 101, 10)\n", "print(\"Original Array: \", arr)\n", "print(arr[np.array([0, 2, 4, 9])])\n", "print(arr[[0, 2, 4, 9]])" ] }, { "cell_type": "code", "execution_count": null, "id": "c4c67184", "metadata": {}, "outputs": [], "source": [ "# Boolean Indexing\n", "import numpy as np\n", "# Create a 1-D array of 10 random integers from 1 to 100 (both inclusive)\n", "arr = np.random.randint(1, 101, 10)\n", "print(\"Original Array: \", arr)\n", "print(arr > 50)\n", "print(arr[arr>50])" ] }, { "cell_type": "code", "execution_count": null, "id": "e38a7684", "metadata": {}, "outputs": [], "source": [ "# Boolean Indexing\n", "import numpy as np\n", "# Create a 1-D array of 10 random integers from 1 to 100 (both inclusive)\n", "arr = np.random.randint(1, 101, 10)\n", "print(\"Original Array: \", arr)\n", "\n", "# Getting even values from array\n", "print(\"\\narr[arr%2 == 0] = \",arr[arr%2==0])\n", "# Getting odd values from array\n", "print(\"arr[arr%2 == 1] = \",arr[arr%2==1])\n", "\n", "# Getting values greater than 50\n", "print(\"\\narr[arr > 50] = \",arr[arr > 50])\n", "# Getting values between 30 and 60, both exclusive (each comparison needs its own parentheses)\n", "print(\"arr[(arr>30) & (arr<60)] = \",arr[(arr>30) & (arr<60)])" ] }, { "cell_type": "code", "execution_count": null, "id": "a9681b3f", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "ebabd453", "metadata": {}, "outputs": [], "source": [ "# We can pass an explicit Boolean mask instead of writing a condition as done above\n", "import numpy as np\n", "arr1 = np.array([1, 2, 3, 4, 5])\n", "print(\"Original Array: \", arr1)\n", "\n", "mask=[False, False, False, True, False]\n", "arr2 = arr1[mask]\n", "\n", "print(\"Filtered Array: \", arr2) " ] }, { 
"cell_type": "code", "execution_count": null, "id": "0aeaaeb9", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "76bb8d63", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "e605bad3", "metadata": {}, "source": [ "### b. Boolean/Fancy Indexing on 2-D Arrays" ] }, { "cell_type": "markdown", "id": "fe68d235", "metadata": {}, "source": [ "**Example 1:**" ] }, { "cell_type": "code", "execution_count": null, "id": "d40fb53e", "metadata": {}, "outputs": [], "source": [ "# By passing a tuple to size means rows and columns\n", "matrix1 = np.random.randint(low = 1, high = 10, size = (5,5))\n", "print(\"Original matrix \\n\", matrix1, \"\\nMatrix Shape = \",matrix1.shape)" ] }, { "cell_type": "markdown", "id": "2830fcfe", "metadata": {}, "source": [ ">Suppose we want a new matrix from above matrix, that only contains rows at index 0, 2, and 3" ] }, { "cell_type": "code", "execution_count": null, "id": "c60b5a89", "metadata": {}, "outputs": [], "source": [ "# Create a corresponding boolean mask\n", "rows_wanted = np.array( [True, False, True, True, False] )\n", "\n", "matrix2 = matrix1[rows_wanted, :]\n", "print(\"\\nFiltered matrix \\n\", matrix2, \"\\nMatrix Shape = \",matrix2.shape)" ] }, { "cell_type": "code", "execution_count": null, "id": "3ce98f44", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "b1291ea3", "metadata": {}, "source": [ "**Example 2:**" ] }, { "cell_type": "code", "execution_count": null, "id": "f46e86d8", "metadata": {}, "outputs": [], "source": [ "# By passing a tuple to size means rows and columns\n", "matrix1 = np.random.randint(low = 1, high = 10, size = (5,5))\n", "print(\"Original matrix \\n\", matrix1, \"\\nMatrix Shape = \",matrix1.shape)" ] }, { "cell_type": "markdown", "id": "0817be5b", "metadata": {}, "source": [ ">Suppose we want a new matrix from above matrix, that only contains columns at index 1 and 3 only" ] }, { 
"cell_type": "code", "execution_count": null, "id": "64352e64", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "5ecceab4", "metadata": {}, "outputs": [], "source": [ "# Create a corresponding boolean mask\n", "cols_wanted = np.array( [False, True, False, True, False] )\n", "\n", "matrix2 = matrix1[ : , cols_wanted]\n", "print(\"\\nFiltered matrix \\n\", matrix2, \"\\nMatrix Shape = \",matrix2.shape)" ] }, { "cell_type": "code", "execution_count": null, "id": "1f9739fe", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "e760d793", "metadata": {}, "source": [ "**Example 3:**" ] }, { "cell_type": "code", "execution_count": 1, "id": "072e6ae7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Original matrix \n", " [[8 5 1 9 1]\n", " [7 2 4 4 2]\n", " [3 5 5 6 1]\n", " [9 4 2 6 9]\n", " [5 5 5 8 2]] \n", "Matrix Shape = (5, 5)\n" ] } ], "source": [ "import numpy as np\n", "# By passing a tuple to size means rows and columns\n", "matrix1 = np.random.randint(low = 1, high = 10, size = (5,5))\n", "print(\"Original matrix \\n\", matrix1, \"\\nMatrix Shape = \",matrix1.shape)" ] }, { "cell_type": "markdown", "id": "2e5fcfae", "metadata": {}, "source": [ ">Suppose we want a new matrix from above 5x5 matrix, that contains inner 3x3 matrix" ] }, { "cell_type": "code", "execution_count": null, "id": "4ce79546", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 2, "id": "a35d1471", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Filtered matrix \n", " [[7 2 4 4 2]\n", " [3 5 5 6 1]\n", " [9 4 2 6 9]] \n", "Matrix Shape = (3, 5)\n" ] } ], "source": [ "rows_wanted = np.array( [False, True, True, True, False] )\n", "\n", "matrix2 = matrix1[ rows_wanted , :]\n", "print(\"\\nFiltered matrix \\n\", matrix2, \"\\nMatrix Shape = \",matrix2.shape)" ] }, { "cell_type": "code", 
"execution_count": null, "id": "151e09a2", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 3, "id": "b6e7b5e8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Filtered matrix \n", " [[2 4 4]\n", " [5 5 6]\n", " [4 2 6]] \n", "Matrix Shape = (3, 3)\n" ] } ], "source": [ "cols_wanted = np.array( [False, True, True, True, False] )\n", "\n", "matrix3 = matrix2[ :, cols_wanted]\n", "print(\"\\nFiltered matrix \\n\", matrix3, \"\\nMatrix Shape = \",matrix3.shape)" ] }, { "cell_type": "markdown", "id": "e180949d", "metadata": {}, "source": [ "**Can you think of doing this in one step or in a more elegant fashion?**" ] }, { "cell_type": "code", "execution_count": 4, "id": "95fd61b0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Filtered matrix \n", " [[2 4 4]\n", " [5 5 6]\n", " [4 2 6]] \n", "Matrix Shape = (3, 3)\n" ] } ], "source": [ "matrix3 = matrix1[1:4, 1:4]\n", "print(\"\\nFiltered matrix \\n\", matrix3, \"\\nMatrix Shape = \",matrix3.shape)" ] }, { "cell_type": "code", "execution_count": 6, "id": "19393f4a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2, 4, 4],\n", " [5, 5, 6],\n", " [4, 2, 6]])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "matrix2 = matrix1[ 1 : -1, 1 : -1 ]\n", "matrix2" ] }, { "cell_type": "markdown", "id": "cd24da4c", "metadata": {}, "source": [ "# Four Slicing Problems with Solutions" ] }, { "cell_type": "markdown", "id": "64383fd6", "metadata": {}, "source": [ " " ] }, { "cell_type": "code", "execution_count": null, "id": "30658582", "metadata": {}, "outputs": [], "source": [ "m1 = np.array([\n", " [0,1,2,3,4,5],\n", " [6,7,8,9,10,11],\n", " [12,13,14,15,16,17],\n", " [18,19,20,21,22,23],\n", " [24,25,26,27,28,29],\n", " [30,31,32,33,34,35]\n", "])" ] }, { "cell_type": "code", "execution_count": null, "id": "86824cb5", "metadata": {}, 
"outputs": [], "source": [ "m1" ] }, { "cell_type": "code", "execution_count": null, "id": "686749a8", "metadata": {}, "outputs": [], "source": [ "m1[0,3:5]" ] }, { "cell_type": "code", "execution_count": null, "id": "c549f90b", "metadata": {}, "outputs": [], "source": [ "m1[4:, 4:]" ] }, { "cell_type": "code", "execution_count": null, "id": "5d377ebd", "metadata": {}, "outputs": [], "source": [ "m2 = m1[:,2]" ] }, { "cell_type": "code", "execution_count": null, "id": "71fe3374", "metadata": {}, "outputs": [], "source": [ "m2.strides" ] }, { "cell_type": "code", "execution_count": null, "id": "8c2119c6", "metadata": {}, "outputs": [], "source": [ "m1[2::2, 0::2]" ] }, { "cell_type": "code", "execution_count": null, "id": "4dc684ff", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 5 }