{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Using the LSST Stack tools to do positional matching on coadd and src catalogs\n", "
Owner: **Jim Chiang** ([@jchiang87](https://github.com/LSSTDESC/DC2-analysis/issues/new?body=@jchiang87))\n", "
Updated by: **Javier Sánchez** ([@fjaviersanchez](https://github.com/LSSTDESC/DC2-analysis/issues/new?body=@fjaviersanchez))\n", "
Last Run: **2020-06-12**\n", "\n", "In this notebook, we use the data butler to retrieve catalogs from coadd and visit-level analyses of Run2.2i, and use the `lsst.afw.table.matchRaDec` function to do positional matching against galaxy truth info extracted from the cosmoDC2 v1.1.4 extragalactic catalog. To enable this, we show how to create a `SourceCatalog` object from the galaxy truth info provided by the GCR interface.\n", "\n", "## Set Up" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import warnings\n", "from collections import namedtuple\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import lsst.afw.geom as afw_geom\n", "import lsst.afw.table as afw_table\n", "import lsst.daf.persistence as dp\n", "import GCRCatalogs\n", "import healpy as hp\n", "import lsst.geom" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## How To Create a Source Catalog\n", "\n", "In order to use the Stack's spatial matching code, we will need to reformat the extragalactic catalog galaxy position and magnitude information into an afw `SourceCatalog` object. These table objects are initialized by a \"schema\", which in turn is built from column definition \"Coldef\" objects. The `mag_cols` function below shows how these Coldefs can be created." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def make_SourceCatalog(new_cols):\n", " \"\"\"\n", " Make a SourceCatalog to contain id and coordinates for each object, plus any new\n", " columns.\n", " \n", " Parameters\n", " ----------\n", " new_cols: list of Coldefs\n", " Column info for adding to an afw.table schema.\n", "\n", " Returns\n", " -------\n", " lsst.afw.table.SourceCatalog: An empty SourceCatalog with the desired schema.\n", " \"\"\"\n", " # The minimal schema just contains the `id`, `coord_ra`, and `coord_dec` fields.\n", " schema = afw_table.SourceTable.makeMinimalSchema()\n", " for coldef in new_cols:\n", " schema.addField(coldef.name, type=coldef.type, doc=coldef.doc)\n", " return afw_table.SourceCatalog(schema)\n", "\n", "\n", "def mag_cols(bands):\n", " \"\"\"Return column information for adding magnitude columns to an afw.table schema.\"\"\"\n", " Coldef = namedtuple('Coldef', 'name type doc'.split())\n", " return [Coldef(f'mag_{x}', float, f'{x}-magnitude')\n", " for x in bands]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Selecting Galaxies\n", "\n", "We are going to downselect cosmoDC2 to the sky region being considered. We'll make a general-purpose RegionSelector class, and then sub-class it for selecting objects in either CCDs or patches." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class RegionSelector:\n", " \"\"\"\n", " Class to rotate the protoDC2 galaxies to the Run1.1p sky location and downselect those galaxies\n", " based on a magnitude limit and on the coordinates of the subregion (i.e., patch or CCD) being\n", " considered.\n", " \"\"\"\n", "\n", " def __init__(self):\n", " pass\n", " \n", " def _set_coord_range(self, bbox, wcs):\n", " \"\"\"\n", " Set the coordinate range of the region.\n", " \n", " Notes\n", " -----\n", " This method is used by the RegionSelector's subclasses.\n", " \n", " Parameters\n", " ----------\n", " bbox: Calexp.BBox\n", " Defines corners of region's bounding box\n", " wcs: Calexp.Wcs\n", " Defines pixel to world (sky) coordinate transformation\n", " \"\"\"\n", " region_box = lsst.geom.Box2D(bbox)\n", " corners = region_box.getCorners()\n", " ra_values, dec_values = [], []\n", " for corner in corners:\n", " ra, dec = wcs.pixelToSky(corner)\n", " ra_values.append(ra.asDegrees())\n", " dec_values.append(dec.asDegrees())\n", " self.ra_range = min(ra_values), max(ra_values)\n", " self.dec_range = min(dec_values), max(dec_values)\n", "\n", " def __call__(self, gc, band, max_mag):\n", " \"\"\"\n", " Create a SourceCatalog object from the input galaxy catalog for the specified band, and\n", " apply the region and magnitude cuts.\n", "\n", " Parameters\n", " ----------\n", " gc: GCRCatalogs GalaxyCatalog\n", " The galaxy catalog obtained via GCR.\n", " band: str\n", " The band, e.g., 'i', to use for the magnitude comparison with the values measured\n", " from the simulated data.\n", " max_mag: float\n", " The magnitude limit to apply.\n", "\n", " Returns\n", " -------\n", " lsst.afw.table.SourceCatalog\n", " \"\"\"\n", " # Retrieve the healpix pixels corresponding to the catalog so we don't query the full catalog\n", " vertices = hp.ang2vec(np.array([self.ra_range[0], self.ra_range[1],\n", " self.ra_range[1], self.ra_range[0]]),\n", " np.array([self.dec_range[0], self.dec_range[0],\n", " self.dec_range[1], self.dec_range[1]]), lonlat=True)\n", " ipix = hp.query_polygon(32, vertices, inclusive=True)\n", " # We are going to pass the healpixels that overlap with our catalog as native filters to speed up the process\n", " native_filter = f'(healpix_pixel == {ipix[0]})'\n", " for ipx in ipix:\n", " native_filter=native_filter+f' | (healpix_pixel == {ipx})'\n", " # Retrieve the desired columns and cut on the magnitude values.\n", " bandname = f'mag_{band}'\n", " filter_ = f'{bandname} < {max_mag}'\n", " print(\"Applying magnitude filter:\", filter_)\n", " gc_cols = gc.get_quantities(['galaxy_id', 'ra', 'dec',\n", " bandname], filters=[filter_, \n", " f'ra > {self.ra_range[0]}',\n", " f'ra < {self.ra_range[1]}',\n", " f'dec > {self.dec_range[0]}',\n", " f'dec < {self.dec_range[1]}',\n", " ],\n", " native_filters = native_filter)\n", " print(\"Number of galaxies within region:\", len(gc_cols['ra']))\n", "\n", " # Create a SourceCatalog with the galaxy_ids, coordinates, magnitudes\n", " galaxy_catalog = make_SourceCatalog(mag_cols((band,)))\n", " for id_, ra, dec, mag in zip(gc_cols['galaxy_id'], gc_cols['ra'], gc_cols['dec'], gc_cols[bandname]):\n", " record = galaxy_catalog.addNew()\n", " record.set('id', id_)\n", " record.set('coord_ra', lsst.geom.Angle(ra, lsst.geom.degrees))\n", " record.set('coord_dec', lsst.geom.Angle(dec, lsst.geom.degrees))\n", " record.set(f'mag_{band}', mag)\n", " return galaxy_catalog" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class CcdSelector(RegionSelector):\n", " \"\"\"RegionSelector to use with visit-level calexps, i.e., single CCD exposures.\"\"\"\n", " def __init__(self, butler, visit, raft, sensor):\n", " super(CcdSelector, self).__init__()\n", " # Get the CCD boundaries\n", " dataId = dict(visit=visit, raft=raft, sensor=sensor)\n", " calexp = butler.get('calexp', dataId=dataId)\n", " self._set_coord_range(calexp.getBBox(), calexp.getWcs())\n", "\n", "\n", "class PatchSelector(RegionSelector):\n", " \"\"\"RegionSelector to use with skyMap patches, i.e., coadd data.\"\"\"\n", " def __init__(self, butler, tract, patch):\n", " super(PatchSelector, self).__init__()\n", " # Get the patch boundaries.\n", " skymap = butler.get('deepCoadd_skyMap')\n", " tractInfo = skymap[tract]\n", " patchInfo = tractInfo.getPatchInfo(eval(patch))\n", " self._set_coord_range(patchInfo.getOuterBBox(), tractInfo.getWcs())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Matching Experiments\n", "\n", "Now we have the tools we need, let's read in the Run 2.2i-DR3 coadd catalog data and match it to the extragalactic catalog input. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create a data butler for the repo.\n", "repo = '/global/cfs/cdirs/lsst/production/DC2_ImSim/Run2.2i/desc_dm_drp/v19.0.0-v1/rerun/run2.2i-coadd-wfd-dr3-v1'\n", "butler = dp.Butler(repo)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sources or Objects?\n", "\n", "We can consider visit-level src catalog data, in which case we would provide a `dataId` to the butler with (`visit`, `raft`, `sensor`) ids; or we can consider coadd object data, for which we would provid a `dataId` with (`filter`, `tract`, `patch`) ids.\n", "\n", "Somewhat different flux models are available in the Run2.2i data for src catalogs versus coadd catalogs." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mag_max = 24.5\n", "# we_are_matching = \"Sources\"\n", "we_are_matching = \"Objects\"\n", "\n", "if we_are_matching == \"Sources\":\n", " # Get the src catalog for a selected visit, raft, and sensor:\n", " visit = 219976\n", " raft = '2,2'\n", " sensor = '1,1'\n", " title = f'Run2.2i-DR3, visit={visit}, raft={raft}, sensor={sensor}'\n", " dataId = dict(visit=visit, raft=raft, sensor=sensor)\n", " catalog = butler.get('src', dataId=dataId)\n", " calexp = butler.get('calexp', dataId=dataId)\n", " filter_ = calexp.getInfo().getFilter().getFilter()\n", " calib = calexp.getPhotoCalib()\n", " #flux_model = 'ext_photometryKron_KronFlux'\n", " flux_model = 'modelfit_CModel'\n", " region_selector = CcdSelector(butler, visit, raft, sensor)\n", "\n", "else:\n", " # Get the coadd catalog for a selected filter, tract, and patch:\n", " filter_ = 'r'\n", " tract = 4638\n", " patch = '2,2'\n", " title = f'Run2.2i-DR3, filter={filter_}, tract={tract}, patch={patch}'\n", " dataId = dict(tract=tract, patch=patch, filter=filter_)\n", " catalog = butler.get('deepCoadd_meas', dataId=dataId)\n", " calexp = butler.get('deepCoadd', dataId=dataId)\n", " calib = calexp.getPhotoCalib()\n", " flux_model = 'modelfit_CModel'\n", " region_selector = PatchSelector(butler, tract, patch)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Focusing on Well-measured Galaxies\n", "\n", "Galaxies can be selected as extended objects (or sources) using the `base_ClassificationExtendedness_value`. We use the model flag and flux to ensure that a flux value could be measured, and then apply a selection to ensure that we get deblended objects. Finally, we apply a relatively bright magnitude cut, to avoid confusion when performing the positional match." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Extract columns to use in the selection:\n", "ext = catalog.get('base_ClassificationExtendedness_value')\n", "model_flag = catalog.get(flux_model + '_flag')\n", "model_flux = catalog.get(flux_model + '_instFlux')\n", "num_children = catalog.get('deblend_nChild')\n", "\n", "# Apply the extendedness, flag, and blendedness cuts using the subset method:\n", "cat_temp = catalog.subset((ext == 1) &\n", " (model_flag == False) &\n", " (model_flux > 0) &\n", " (num_children == 0))\n", "# Extract the magnitude and again use subset to apply the depth cut:\n", "_mag = calib.instFluxToMagnitude(cat_temp, flux_model) \n", "mag = _mag[:,0] #Magnitude\n", "magerr = _mag[:,1] #Magnitude Error\n", "#print(mag, len(mag))\n", "cat_temp = cat_temp.subset(mag < mag_max)\n", "\n", "# Repackage everything in a more minimal SourceCatalog and add a magnitude column for comparing\n", "# to the galaxy catalog truth values.\n", "drp_catalog = make_SourceCatalog(mag_cols((filter_,)))\n", "for record in cat_temp:\n", " new_rec = drp_catalog.addNew()\n", " for name in 'id coord_ra coord_dec parent'.split():\n", " new_rec.set(name, record[name])\n", " mag = calib.instFluxToMagnitude(record, flux_model)\n", " new_rec.set(f'mag_{filter_}', mag.value)\n", " \n", "print(\"Number of observed objects in our DRP galaxy catalog\", len(drp_catalog))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Extracting the Truth Info\n", "\n", "We can now use our region_selector object to process the protoDC2 extragalactic catalog. Note that while we instantiated it with a butler, so that it could work on selecting galaxy observations from either a tract or a CCD, we can _call_ it as a function (via its `__call__` method) which takes a GCR catalog object as its first argument. The result will be a DM Stack SourceCatalog object, that we can match to our observed DRP catalog." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Read in the galaxy catalog data.\n", "with warnings.catch_warnings():\n", " warnings.filterwarnings('ignore')\n", " gc = GCRCatalogs.load_catalog('cosmoDC2_v1.1.4_image')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create a SourceCatalog from the gc data, applying the region and magnitude selections.\n", "galaxy_catalog = region_selector(gc, band=filter_, max_mag=mag_max)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare the number of galaxies within our region with th enumber of observed galaxies in the DRP catalog. Is this what you would expect?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Doing the Matching\n", "We can now carry out the spatial matching, and compute some quantities to plot." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Find positional matches within 100 milliarcseconds:\n", "radius = lsst.geom.Angle(0.1, lsst.geom.arcseconds)\n", "matches = afw_table.matchRaDec(drp_catalog, galaxy_catalog, radius)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`matches` is a list of `match` objects, each one containing an observed-true matchd galaxy pair. The code below shows how to work with these, looping over the matches and extracting information to plot." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compare magnitudes for matched objects:\n", "drp_mag = np.zeros(len(matches), dtype=np.float)\n", "gc_mag = np.zeros(len(matches), dtype=np.float)\n", "sep = np.zeros(len(matches), dtype=np.float)\n", "# Arrays for a quiver plot.\n", "u = np.zeros(len(matches), dtype=np.float)\n", "v = np.zeros(len(matches), dtype=np.float)\n", "for i, match in enumerate(matches):\n", " drp_mag[i] = match.first[f'mag_{filter_}']\n", " gc_mag[i] = match.second[f'mag_{filter_}']\n", " sep[i] = np.degrees(match.distance)*3600.*1000.\n", " u[i] = match.first['coord_ra'] - match.second['coord_ra']\n", " v[i] = match.first['coord_dec'] - match.second['coord_dec']\n", "print(\"Number of matches:\", len(matches))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Start a 2x2 panel figure:\n", "fig = plt.figure(figsize=(8, 8))\n", "frame_axes = fig.add_subplot(111, frameon=False)\n", "frame_axes.set_title(title)\n", "frame_axes.get_xaxis().set_ticks([])\n", "frame_axes.get_yaxis().set_ticks([])\n", "\n", "delta_mag = drp_mag - gc_mag # Observed - True\n", "\n", "# Upper Left: Histogram of match separations.\n", "fig.add_subplot(2, 2, 1)\n", "plt.hist(sep, range=(0, 100), histtype='step', bins=40)\n", "plt.xlabel('separation (marcsec)')\n", "plt.ylabel('entries / bin')\n", "\n", "# Upper Right: Quiver plot of (DRP - galaxy_catalog) positions on the sky.\n", "fig.add_subplot(2, 2, 2)\n", "plt.quiver(np.degrees(drp_catalog['coord_ra']),\n", " np.degrees(drp_catalog['coord_dec']),\n", " u, v)\n", "plt.xlabel('RA (deg)')\n", "plt.ylabel('Dec (deg)')\n", "\n", "# Lower left: Difference in magnitudes vs true magnitude (mag_gc).\n", "fig.add_subplot(2, 2, 3)\n", "plt.errorbar(gc_mag, delta_mag, fmt='.', alpha=0.1)\n", "plt.xlabel(f'True mag {filter_}_gc'.format(filter_))\n", "plt.ylabel(f'Mag difference ({filter_}_gc - {filter_}_drp)')\n", "\n", "# Difference in magnitudes vs separation.\n", "fig.add_subplot(2, 2, 4)\n", "plt.errorbar(sep, delta_mag, fmt='.', alpha=0.1)\n", "plt.xlabel('separation (mas)')\n", "plt.ylabel(f'Mag difference ({filter_}_gc - {filter_}_drp)')\n", "\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Going Further\n", "\n", "The available columns in a `SourceCatalog` can be seen by printing the info from the schema that it carries around with it. The cells below show you what you have available. The drp_catalog and galaxy_catalog that we made to do the spatial matching only have positions and magnitudes in them - but the parent catalogs have many more quantities. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for item in drp_catalog.schema:\n", " print(f\"{item.field.getName()}: {item.field.getDoc()}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for item in galaxy_catalog.schema:\n", " print(f\"{item.field.getName()}: {item.field.getDoc()}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# All the DRP measurements:\n", "for item in catalog.schema:\n", " print(f\"{item.field.getName()}: {item.field.getDoc()}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# All the cosmoDC2 parameters:\n", "# help(gc)\n", "gc.list_all_quantities()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "desc-stack", "language": "python", "name": "desc-stack" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2" } }, "nbformat": 4, "nbformat_minor": 4 }