{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Raise and check a flag array with numpy\n", "> Handle flag array bits with numpy bitwise operations\n", "- categories: [jupyter, numpy]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To describe the data quality of timelines or images, we often use an array of integers in which each bit has a specific meaning, so that we can identify which issues affect each data point.\n", "\n", "For example, suppose we have 10 data points and assign an 8-bit integer to each one for data quality.\n", "Generally `0` means a good data point, and any raised bit signals some problem in the data. This is more compact than keeping a separate boolean array per issue, and it allows batch `np.bitwise_and` and `np.bitwise_or` operations." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "flag = np.zeros(10, dtype=np.uint8)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The array uses just 8 bits (1 byte) per element:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Variable Type Data/Info\n", "-------------------------------\n", "flag ndarray 10: 10 elems, type `uint8`, 10 bytes\n", "np module kages/numpy/__init__.py'>\n" ] } ], "source": [ "%whos" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Raising a bit seems as easy as adding `2**bit` to the array;\n", "for example, bit 4 (counting from 0) has value 16, so:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "flag[2:5] += 2**4" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 0, 16, 16, 16, 0, 0, 0, 0, 0], dtype=uint8)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "flag" ] 
}, { "cell_type": "markdown", "metadata": {}, "source": [ "The issue is that this only works if that bit was `0`. If it was already raised, the addition clears it and carries into the next higher bit:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "flag[2] += 2**4" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 0, 32, 16, 16, 0, 0, 0, 0, 0], dtype=uint8)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "flag" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Use bitwise operations\n", "\n", "Fortunately `numpy` supports bitwise operations that make this easier, see the 3 functions below:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def raise_bit_inplace(flag, bit=0):\n", " \"\"\"Raise bit of the flag array in place\n", " \n", " This function modifies the input array,\n", " it also works on slices\n", " \n", " Parameters\n", " ----------\n", " flag : np.array\n", " flag bit-array, generally unsigned integer\n", " bit : int\n", " bit number to raise\n", " \"\"\"\n", " flag[:] = np.bitwise_or(flag, 2**bit)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def raise_bit(flag, bit=0):\n", " \"\"\"Raise bit of the flag array\n", "\n", " Parameters\n", " ----------\n", " flag : np.array\n", " flag bit-array, generally unsigned integer\n", " bit : int\n", " bit number to raise\n", " \n", " Returns\n", " -------\n", " output_flag : np.array\n", " input array with the requested bit raised\n", " \"\"\"\n", " return np.bitwise_or(flag, 2**bit)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def check_bit(flag, bit=0):\n", " \"\"\"Check if bit of the flag array is raised\n", "\n", " The output is a boolean array which could\n", " be used for slicing another array.\n", " \n", 
" Parameters\n", " ----------\n", " flag : np.array\n", " flag bit-array, generally unsigned integer\n", " bit : int\n", " bit number to check\n", " \n", " Returns\n", " -------\n", " is_raised : bool np.array\n", " True if the bit is raised, False otherwise \n", " \"\"\"\n", " return np.bitwise_and(flag, int(2**bit)) > 0" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "is_bit4_raised = check_bit(flag, bit=4)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([False, False, False, True, True, False, False, False, False,\n", " False])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "is_bit4_raised" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "assert np.all(is_bit4_raised[3:5])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "They also work with slices of an array:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "raise_bit_inplace(flag[6:], bit=1)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 0, 32, 16, 16, 0, 2, 2, 2, 2], dtype=uint8)" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "flag" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "# Running it twice doesn't change the value of the flag\n", "raise_bit_inplace(flag[6:], bit=1)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 0, 32, 16, 16, 0, 2, 2, 2, 2], dtype=uint8)" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "flag" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([False, False, False, False, False, 
False, True, True, True,\n", " True])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "check_bit(flag, 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First blog post using a Jupyter Notebook with [fastpages](https://github.com/fastai/fastpages#writing-blog-posts-with-jupyter)!!" ] } ], "metadata": { "kernelspec": { "display_name": "PySM 3", "language": "python", "name": "pysm3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" } }, "nbformat": 4, "nbformat_minor": 2 }