{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# DC2: Generate Postage Stamps (Cutouts) for objects in the Object Catalog\n", "\n", "Owner: **Yao-Yuan Mao** ([@yymao](https://github.com/LSSTDESC/DC2-analysis/issues/new?body=@yymao))\n", "
Last Verified to Run: **2020-11-30** (by @yymao)\n", "\n", "This notebook is partly based on the `dm_butler_postage_stamps` notebook by Michael Wood-Vasey and the Stack Club `ButlerTutorial` by Daniel Perrefort.\n", "\n", "In this notebook, we will first obtain a list of RA, Dec positions from the Object catalog using [GCRCatalogs](https://github.com/LSSTDESC/gcr-catalogs/), and then generate \"postage-stamp\" cutout images from the coadded images. \n", "\n", "### Learning Objectives:\n", "After working through and studying this Notebook you should be able to\n", " 1. Find the corresponding tracts and patches for a given list of RA and Dec\n", " 2. Generate a postage stamp from a coadd for your chosen RA, Dec in a chosen filter, using matplotlib\n", " 3. Obtain cutout images in two different ways ( `.getCutout()` and `butler.get(bbox=..)` ) \n", " 4. Generate false color RGB images\n", "\n", "### Logistics\n", "This is intended to be runnable at NERSC through the https://jupyter.nersc.gov interface from a local git clone of https://github.com/LSSTDESC/DC2-analysis in your NERSC directory. But you can also run it wherever, with appropriate adjustment of the 'repo' location to point to a place where you have a Butler repo will all of the images. \n", "\n", "This notebook uses the `desc-stack-weekly-latest` kernel. Instructions for setting up the proper DESC python kernel can be found here: https://confluence.slac.stanford.edu/x/o5DVE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set up\n", "\n", "First we will load the needed modules and DC2 DR6 data sets: object catalogs (with `GCRCatalogs`) and DRP products (with `desc_dc2_dm_data`)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A few common packages\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "%matplotlib inline\n", "\n", "# We will use astropy's WCS and ZScaleInterval for plotting\n", "from astropy.wcs import WCS\n", "from astropy.visualization import ZScaleInterval\n", "\n", "# We will use several stack functions\n", "import lsst.geom\n", "import lsst.afw.display as afwDisplay\n", "import lsst.afw.display.rgb as rgb\n", "\n", "# And also DESC packages to get the data path\n", "import GCRCatalogs\n", "from GCRCatalogs import GCRQuery\n", "import desc_dc2_dm_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you are on a custom kernel (i.e., not using a DESC kernel), and you do have stack installed, but not those DESC packages.\n", "You can install them by uncomment and running the following cell." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Uncomment and run this cell *only if* you are on a custom kernel that has stack installed, but not the DESC packages \n", "\n", "#%pip install https://github.com/LSSTDESC/gcr-catalogs/archive/master.zip\n", "#%pip install https://github.com/LSSTDESC/desc-dc2-dm-data/archive/master.zip" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sorted(desc_dc2_dm_data.REPOS)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will be using the DC2 Run 2.2i DR6 WFD data. Read more about this data set here: https://arxiv.org/abs/2010.05926" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dc2_data_version = \"2.2i_dr6_wfd\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "GCRCatalogs.get_available_catalogs(names_only=True, name_contains=dc2_data_version)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cat = GCRCatalogs.load_catalog(\"dc2_object_run\"+dc2_data_version)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "butler = desc_dc2_dm_data.get_butler(dc2_data_version)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get a list of interesting objects\n", "\n", "Here we will find some brightest galaxies in the object catalog to make cutout images for them! \n", "\n", "To learn what columns are in the object catalogs, refer to [this schema table](https://github.com/LSSTDESC/gcr-catalogs/blob/master/GCRCatalogs/SCHEMA.md#schema-for-dc2-object-catalogs). And sometimes it'd be helpful to look at the [source code](https://github.com/LSSTDESC/gcr-catalogs/blob/master/GCRCatalogs/dc2_object.py#L341)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bright_galaxy_query = GCRQuery(\n", " \"clean\",\n", " \"extendedness == 1\",\n", " \"mag_r_cModel < 16\",\n", " \"snr_g_cModel > 10\",\n", " \"snr_r_cModel > 10\",\n", " \"snr_i_cModel > 10\",\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "columns_to_get = [\"objectId\", \"ra\", \"dec\", \"mag_r_cModel\", \"tract\", \"patch\"]\n", "assert cat.has_quantities(columns_to_get)\n", "\n", "# Here we use native_filters to limit to tract == 4639 to save some load time\n", "\n", "objects = cat.get_quantities(columns_to_get, filters=bright_galaxy_query, native_filters=\"tract == 4639\")\n", "objects # get_quantities returns an ordinary python dictionary" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# make it a pandas data frame for the ease of manipulation\n", "objects = pd.DataFrame(objects)\n", "objects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that in the object catalog there's already tract and patch information. \n", "\n", "## What are tracts and patches?\n", "\n", "The coadds produced by the DM stack are structured in terms of large `tracts` and smaller `patches`, illustrated here for DC2 Run2.2i (left panel), which covers 300 square degrees and has 165 tracts. The right panel shows a zoom-in version of the upper right corner, where you can see the patch structure. \n", "\n", "![DC2 Run 2 Sky Map](assets/dc2_skymap_run2.png)\n", "\n", "### How do I find tract/patch for given RA/Dec?\n", "\n", "What if tract/patch information were not available in the catlaog? We will then need to load the skymap, which stores such information. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "skymap = butler.get('deepCoadd_skyMap')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# find tract and patch for the 0th object: \n", "\n", "object_this = objects.loc[0]\n", "\n", "radec = lsst.geom.SpherePoint(object_this[\"ra\"], object_this[\"dec\"], lsst.geom.degrees)\n", "tractInfo = skymap.findTract(radec)\n", "patchInfo = tractInfo.findPatch(radec)\n", "\n", "print(\"tract =\", tractInfo.getId(), \"; patch =\", patchInfo.getIndex())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's now check the tract/patch values for all objects in our table are indeed consistent with what sky map tells us. \n", "\n", "This step is not really necessary. \n", "However, if you have a list of RA/Dec but not tract and patch information \n", "(for example, if you want to generate postage stamps for a list of galaxies from cosmoDC2),\n", "the cell below would be useful! " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_tract_patch(ra, dec, units=lsst.geom.degrees):\n", " radec = lsst.geom.SpherePoint(ra, dec, units)\n", " tractInfo = skymap.findTract(radec)\n", " patchInfo = tractInfo.findPatch(radec)\n", " return tractInfo.getId(), \"{},{}\".format(*patchInfo.getIndex())\n", "\n", "\n", "tract_patch = objects.apply(lambda row: get_tract_patch(row[\"ra\"], row[\"dec\"]), axis=1, result_type='expand') \\\n", " .rename(columns={0: \"tract\", 1: \"patch\"})\n", "\n", "assert (objects[\"tract\"] == tract_patch[\"tract\"]).all()\n", "assert (objects[\"patch\"] == tract_patch[\"patch\"]).all()\n", "\n", "tract_patch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load coadd images using butler\n", "\n", "Once we know the tract and patch, we will be able to load coadd images using butler." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "butler.getKeys(\"deepCoadd\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataId = {\"tract\": object_this[\"tract\"], \"patch\": object_this[\"patch\"], \"filter\": \"i\"}\n", "print(dataId)\n", "\n", "full_patch = butler.get(\"deepCoadd\", dataId=dataId)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's take a look at the full patch first\n", "fig = plt.figure(figsize=(8, 8), dpi=100)\n", "# Note that we set frame=1 below to allow afwDisplay to use the figure instance we created\n", "display = afwDisplay.Display(1, backend='matplotlib')\n", "display.scale(\"linear\", \"zscale\")\n", "display.mtv(full_patch.getMaskedImage().getImage())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make a postage stamp!\n", "\n", "To make a postage stamp, we can simply use the `getCutout` method to obtain the cutout from the full patch data. \n", "We can also obtain WCS information so that we can show RA/Dec values on teh axes. \n", "\n", "Here we will use plain matplotlib to make our postage stamp, because I assume most people are more familar with matplotlib than `afw.display`! \n", "However, `afw.display` provides some useful functions and powerful integration with ds9 and others, if you get some time to learn it! " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define the center and size of our cutout\n", "radec = lsst.geom.SpherePoint(object_this[\"ra\"], object_this[\"dec\"], lsst.geom.degrees)\n", "cutout_size = 300 # 300 pixels -> about 1 arcmin (we'll see why in a bit!)\n", "cutout_extent = lsst.geom.ExtentI(cutout_size, cutout_size)\n", "\n", "# Retrieve cutout\n", "cutout = full_patch.getCutout(radec, cutout_extent)\n", "\n", "# Retrieve wcs\n", "wcs = cutout.getWcs()\n", "print(wcs)\n", "wcs_fits_meta = wcs.getFitsMetadata()\n", "\n", "# Retrieve the image array\n", "image_arr = cutout.getMaskedImage().getImage().array" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# make plot with astropy.wcs and matplotlib\n", "\n", "fig, ax = plt.subplots(subplot_kw={'projection': WCS(wcs_fits_meta)}, figsize=(4, 4), dpi=100)\n", "\n", "vmin, vmax = ZScaleInterval().get_limits(image_arr)\n", "ax.imshow(image_arr, vmin=vmin, vmax=vmax, cmap='binary_r', origin='lower')\n", "\n", "ax.set_xlabel(\"RA\")\n", "ax.set_ylabel(\"Dec\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### A different way to load cutout image by specifying a bbox" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "center = skymap.findTract(radec).getWcs().skyToPixel(radec)\n", "bbox = lsst.geom.BoxI(lsst.geom.Point2I((center.x - cutout_size*0.5, center.y - cutout_size*0.5)), cutout_extent)\n", "\n", "# Note the postfix `_sub` added to the dataset type!! Note how we skip the `full_patch` step here! \n", "cutout = butler.get(\"deepCoadd_sub\", dataId=dataId, bbox=bbox)\n", "\n", "wcs_fits_meta = cutout.getWcs().getFitsMetadata()\n", "image_arr = cutout.getMaskedImage().getImage().array" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We should obtain the same image as above! \n", "\n", "fig, ax = plt.subplots(subplot_kw={'projection': WCS(wcs_fits_meta)}, figsize=(4, 4), dpi=100)\n", "\n", "vmin, vmax = ZScaleInterval().get_limits(image_arr)\n", "ax.imshow(image_arr, vmin=vmin, vmax=vmax, cmap='binary_r', origin='lower')\n", "\n", "ax.set_xlabel(\"RA\")\n", "ax.set_ylabel(\"Dec\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the bbox method, we can make the loading time shorter (usually 2-3x speed up). \n", "This is useful when you need to make many cutouts! " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "cutout = butler.get(\"deepCoadd\", dataId=dataId).getCutout(radec, cutout_extent)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%timeit\n", "cutout = butler.get(\"deepCoadd_sub\", dataId=dataId, bbox=bbox)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make a false color RGB image\n", "\n", "Now we have all the tools to make a false color RGB image. We will need to load in images from three bands. Here we will use g, r, and i bands. We then can use `lsst.afw.display.rgb.makeRGB` to generate the false color image." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Note how we directly supply keyword arguments instead of a single dataId dictionary here\n", "# Also note the band ordering. We will use \"irg\" for \"RGB\", respectively. \n", "cutouts = [butler.get(\"deepCoadd_sub\", bbox=bbox, tract=object_this[\"tract\"], patch=object_this[\"patch\"], filter=band) for band in \"irg\"]\n", "\n", "wcs_fits_meta = cutouts[0].getWcs().getFitsMetadata()\n", "image_rgb = rgb.makeRGB(*cutouts)\n", "del cutouts # let gc save some memory for us\n", "\n", "fig, ax = plt.subplots(subplot_kw={'projection': WCS(wcs_fits_meta)}, figsize=(4, 4), dpi=100)\n", "ax.imshow(image_rgb, origin='lower')\n", "ax.set_xlabel(\"RA\")\n", "ax.set_ylabel(\"Dec\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Nice image, right? And it appears to be a blended object with a miscentering! \n", "\n", "## Yay! Let's put this all together now! \n", "\n", "This final cell is going to take a bit longer (about 50-60 seconds) " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig = plt.figure(figsize=(16, 16), dpi=100)\n", "gs = plt.GridSpec(4, 4, fig)\n", "\n", "cutout_size = 300\n", "cutout_extent = lsst.geom.ExtentI(cutout_size, cutout_size)\n", "\n", "for (_, object_this), gs_this in zip(objects.iterrows(), gs):\n", " radec = lsst.geom.SpherePoint(object_this[\"ra\"], object_this[\"dec\"], lsst.geom.degrees)\n", " center = skymap.findTract(radec).getWcs().skyToPixel(radec)\n", " bbox = lsst.geom.BoxI(lsst.geom.Point2I((center.x - cutout_size*0.5, center.y - cutout_size*0.5)), cutout_extent)\n", "\n", " cutouts = [butler.get(\"deepCoadd_sub\", bbox=bbox, tract=object_this[\"tract\"], patch=object_this[\"patch\"], filter=band) for band in \"irg\"]\n", " wcs_fits_meta = cutouts[0].getWcs().getFitsMetadata()\n", " image_rgb = rgb.makeRGB(*cutouts)\n", " del cutouts # let gc save some memory for us\n", "\n", " ax = plt.subplot(gs_this, projection=WCS(wcs_fits_meta), label=str(object_this[\"objectId\"]))\n", " ax.imshow(image_rgb, origin='lower')\n", " del image_rgb # let gc save some memory for us\n", " \n", " for c in ax.coords:\n", " c.set_ticklabel(exclude_overlapping=True, size=10)\n", " c.set_axislabel('', size=0)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "desc-stack-weekly-latest", "language": "python", "name": "desc-stack-weekly-latest" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.8" } }, "nbformat": 4, "nbformat_minor": 4 }