{ "cells": [ { "cell_type": "markdown", "id": "68ff9f10-1223-4f09-a504-8bbf8ca2495c", "metadata": {}, "source": [ "# Flagging\n", "\n", "This notebook shows how to use flags in dysh and reviews the distinction between flagging and masking.\n", "\n", "In dysh there are four sources of flags: a .flag file, user-generated flags, flags generated from the SDFITS metadata (at present these only flag VEGAS channels known to be bad, i.e., [VEGAS spurs](https://dysh.readthedocs.io/en/latest/reference/glossary.html#term-VEGAS-spurs)), and the FLAGS column of the SDFITS binary table. The GBT does not produce data with a FLAGS column; however, dysh writes the FLAGS column when you call ``GBTFITSLoad.write(flags=True)``, which is the default.\n", "\n", "* Reading flags from a .flag file is controlled by the ``skipflags`` argument of ``GBTFITSLoad``. By default this is set to True, because the default flags in a .flag file are VEGAS spurs, and those are flagged instead from the SDFITS metadata. If you have generated flags in a .flag file, the file must have the same name as the SDFITS file it applies to, but ending in .flag instead of .fits, and you must set ``skipflags=False`` when calling ``GBTFITSLoad``. Lines in a flag file with IDSTRING ``VEGAS_SPUR`` are ignored by default, even if ``skipflags=False``.\n", "\n", "* Flags generated from the metadata are applied during the calibration process, e.g., when calling ``GBTFITSLoad.getsigref``. For any of the calibration routines, the flagging of VEGAS spurs is controlled by the keyword ``flag_vegas``. By default it is set to True, meaning that VEGAS spurs will be flagged. Once the data are calibrated and the VEGAS spur flags have been generated, the flagging can be reverted in two ways. 
The first is to ignore all flags during calibration using ``apply_flags=False``; the second is to clear all flags using ``GBTFITSLoad.clear_flags`` and repeat the calibration using ``flag_vegas=False``.\n", "\n", "* Flags in the FLAGS column are always read in if present and will be applied in a calibration method if ``apply_flags=True`` or if you call ``GBTFITSLoad.apply_flags``. These flags can be cleared with ``GBTFITSLoad.clear_flags``.\n", "\n", "\n", "This notebook will explain how to generate your own flags using dysh, and show some examples of the two points above.\n", "\n", "## Dysh commands\n", "\n", "The following dysh commands are introduced (leaving out all the function arguments):\n", "\n", "    filename = dysh_data()\n", "    sdf = GBTFITSLoad()\n", "    sdf.flags.show()\n", "    sdf.flags.clear()\n", "    sdf.flag()\n", "    sdf.clear_flags()\n", "    sb = sdf.getps()\n", "    sb.plot()\n", "    sb.plot().write()\n", "\n", "## Loading Modules\n", "We start by loading the modules we will use for this example.\n", "\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "307976bc-facc-4066-a51b-baa376f1c3aa", "metadata": {}, "outputs": [], "source": [ "# These modules are required for working with the data.\n", "from dysh.fits.gbtfitsload import GBTFITSLoad\n", "from dysh.util.selection import Selection\n", "from dysh.log import init_logging\n", "import numpy as np\n", "\n", "# These modules are used for file I/O.\n", "from dysh.util.files import dysh_data\n", "from pathlib import Path\n", "import tarfile\n", "\n", "# matplotlib is used for plotting.\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "3af012ae-ded1-4fad-a7a2-c845d52d2ed7", "metadata": {}, "source": [ "## Setup\n", "dysh uses a logger to communicate. If you are working on the command\n", "line, then logging is set up for you. If you are working in a\n", "JupyterLab instance, then you need to set it up yourself. You can do so using\n", "the init_logging function imported above. 
As an argument, init_logging\n", "takes a number, the verbosity level: 0 for error messages\n", "only, 1 for warnings, 2 for info, and 3 for debug. Here we set it to\n", "level 2." ] }, { "cell_type": "code", "execution_count": 3, "id": "cfd7d832-4ca9-450e-a296-de2bdcd6f125", "metadata": {}, "outputs": [], "source": [ "init_logging(2)\n", "\n", "# Also create a local \"output\" directory where temporary notebook files can be stored.\n", "output_dir = Path.cwd() / \"output\"\n", "output_dir.mkdir(exist_ok=True)" ] }, { "cell_type": "markdown", "id": "c636f392-8911-4680-b162-9c9c5898325c", "metadata": {}, "source": [ "## Data Retrieval\n", "\n", "This time we download the data as a tar.gz file and then unpack it into a local \"data\" directory." ] }, { "cell_type": "code", "execution_count": 3, "id": "b3f37b50-3d5b-411f-bab7-913ccef466be", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "12:12:57.766 I url: http://www.gb.nrao.edu/dysh//example_data/rfi-L/data/AGBT17A_404_01.tar.gz\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "AGBT17A_404_01.tar.gz already downloaded\n" ] } ], "source": [ "filename = dysh_data(example=\"rfi-L/data/AGBT17A_404_01.tar.gz\")" ] }, { "cell_type": "code", "execution_count": 4, "id": "1274c6fa-c49f-4e9f-a6f1-b6bcbe7d6981", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/usr/lib64/python3.11/tarfile.py:2303: RuntimeWarning: The default behavior of tarfile extraction has been changed to disallow common exploits (including CVE-2007-4559). By default, absolute/parent paths are disallowed and some mode bits are cleared. 
See https://access.redhat.com/articles/7004769 for more details.\n", " warnings.warn(\n" ] } ], "source": [ "# Unpack the archive; the with block closes the file automatically.\n", "with tarfile.open(filename) as targz:\n", "    targz.extractall('./data/')" ] }, { "cell_type": "markdown", "id": "378da2ce-3f5c-44fe-a692-23cda7afe9cd", "metadata": {}, "source": [ "## Data Loading\n", "\n", "After unpacking the data, we load it with `skipflags=False` so that the .flag file is read. Notice how `dysh` warns that it found no flag rules in the .flag file." ] }, { "cell_type": "code", "execution_count": 5, "id": "feb1f1dd-0f13-4ad7-a015-13bae16f4252", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/export/home/fornax/psalas/trunk/dysh/src/dysh/util/selection.py:1377: UserWarning: Pandas doesn't allow columns to be created via a new attribute name - see https://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access\n", " self._flag_file_rep = []\n", "12:12:57.906 W No flag rules found in file data/AGBT17A_404_01.raw.vegas/AGBT17A_404_01.raw.vegas.A.flag\n", "12:12:57.914 I Index loaded from .index file (44/93 columns). Missing columns (TCAL, WCS, calibration metadata, etc.) will be automatically loaded from FITS file when first accessed.\n" ] } ], "source": [ "sdfits = GBTFITSLoad(\"./data/AGBT17A_404_01.raw.vegas\", skipflags=False)" ] }, { "cell_type": "markdown", "id": "92293746-91ea-47f6-b276-c0ade5738184", "metadata": {}, "source": [ "What flags were loaded?" ] }, { "cell_type": "code", "execution_count": 6, "id": "e950c32c-d645-4653-838a-85b161324a73", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " ID TAG OBJECT BANDWID DATE-OBS ... SUBOBSMODE FITSINDEX CHAN UTC # SELECTED\n", "--- --- ------ ------- -------- ... 
---------- --------- ---- --- ----------\n" ] } ], "source": [ "sdfits.flags.show()" ] }, { "cell_type": "markdown", "id": "207dbfad-a4c0-4f48-befc-865590049922", "metadata": {}, "source": [ "The empty table above shows that the .flag file contained no flag rules, so no flags were loaded.\n", "\n", "Now, let's look at the summary." ] }, { "cell_type": "code", "execution_count": 7, "id": "efd9571f-a01e-4756-9164-a91af211b930", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| SCAN | \n", "OBJECT | \n", "VELOCITY | \n", "PROC | \n", "PROCSEQN | \n", "RESTFREQ | \n", "# IF | \n", "# POL | \n", "# INT | \n", "# FEED | \n", "AZIMUTH | \n", "ELEVATION | \n", "
|---|---|---|---|---|---|---|---|---|---|---|---|
| 19 | \n", "A123606 | \n", "6600.0 | \n", "OnOff | \n", "1 | \n", "1.420406 | \n", "1 | \n", "2 | \n", "61 | \n", "1 | \n", "64.5805 | \n", "48.3795 | \n", "
| 20 | \n", "A123606 | \n", "6600.0 | \n", "OnOff | \n", "2 | \n", "1.420406 | \n", "1 | \n", "2 | \n", "61 | \n", "1 | \n", "64.6012 | \n", "48.4338 | \n", "