{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Healsparse Tutorial for DESC Sprint Week 11-30-2020\n", "\n", "This is short tutorial on how to use [healsparse](https://healsparse.readthedocs.io/en/latest/) and [suprême](https://github.com/LSSTDESC/supreme) survey property maps for the DC2 imaging.\n", "\n", "HealSparse is a sparse implementation of HEALPix in Python, written for the Rubin Observatory Legacy Survey of Space and Time Dark Energy Science Collaboration (DESC). HealSparse is a pure Python library that sits on top of numpy and healpy and is designed to avoid storing full sky maps in case of partial coverage, including easy reading of sub-maps. This reduces the overall memory footprint allowing maps to be rendered at arcsecond resolution while keeping the familiarity and power of healpy.\n", "\n", "Note that healsparse does not magically make memory appear. If you need to do correlation functions at high resolution over the full sky, this will not help you. But if you need to do value look-ups for survey properties, as well as generating maps and masks from geometric primitives, the healsparse maps allow you to focus on a smaller region of sky at high resolution in a way that native healpy does not." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import healpy as hp\n", "import healsparse" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To conserve memory, HealSparse uses a dual-map approach, where a low-resolution full-sky “coverage map” is combined with a high resolution map containing the pixel data where it is available. The resolution of the coverage map is controlled by the `nside_coverage` parameter, and the resolution of the high-resolution map is controlled by the `nside_sparse` parameter. Behind the scenes, HealSparse uses clever indexing to allow the user to treat these as contiguous maps with minimal overhead. **All HealSparse maps use HEALPix nest indexing behind the scenes, should be treated as nest-indexed maps.**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us begin by creating a healsparse map at high resolution. A coverage `nside_coverage = 32` is a good default value for a balance between memory usage and efficiency, though if you have a very high resolution map over a small area you might want a larger `nside_coverage` (higher resolution)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# To create a new map, the resolutions and datatype must be specified\n", "nside_coverage = 32\n", "nside_sparse = 16384\n", "map1 = healsparse.HealSparseMap.make_empty(nside_coverage, nside_sparse, np.float64)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To set values in the map, you can use simple indexing or the explicit API:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "map1[0: 1000] = np.arange(1000, dtype=np.float64)\n", "map1.update_values_pix(np.arange(1000, 2000), np.arange(1000, dtype=np.float64))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To retrieve values from the map, you can use simple indexing or the explicit API via pixels or positions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(map1[0: 1000])\n", "print(map1.get_values_pix(np.arange(1000, 2000), nest=True))\n", "print(map1.get_values_pos(45.0, 0.1, lonlat=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A HealSparseMap has the concept of “valid pixels”, the pixels over which the map is defined (as opposed to `healpy.UNSEEN` in the case of floating point maps). You can retrieve the array of valid pixels or the associated positions of the valid pixels easily:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(map1.valid_pixels)\n", "\n", "ra, dec = map1.valid_pixels_pos(lonlat=True)\n", "plt.plot(ra, dec, 'r.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can convert a HealSparseMap to a healpy map (numpy array) either by using a full slice (`[:]`) or with the `generate_healpix_map()` method. Do watch out, at high resolution this can blow away your memory! In these cases, `generate_healpix_map()` can degrade the map before conversion, using a reduction function (over valid pixels) of your choosing, including `mean`, `median`, `std`, `max`, `min`, `and`, `or`, `sum`, and `prod` (product)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hpmap4096 = map1.generate_healpix_map(nside=4096, reduction='mean')\n", "gdpix, = np.where(hpmap4096 > hp.UNSEEN)\n", "ra, dec = hp.pix2ang(4096, gdpix, nest=True, lonlat=True)\n", "plt.plot(ra, dec, 'r.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Integer Maps\n", "In addition to floating-point maps, which are natively supported by `healpy`, HealSparseMap supports integer maps. The “sentinel” value of these maps (equivalent to `healpy.UNSEEN`) is either `-MAXINT` or `0`, depending on the desired use of the map (e.g., integer values or positive bitmasks). Note that these maps *cannot* be trivially converted to healpy maps because HEALPix has no concept of sentinel values that are not `healpy.UNSEEN`, which is a very large negative floating-point value." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "map_int = healsparse.HealSparseMap.make_empty(32, 4096, np.int32)\n", "print(map_int)\n", "map_int[0: 1000] = np.arange(1000, dtype=np.int32)\n", "print(map_int[500])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Recarray Maps\n", "\n", "HealSparseMap also supports maps made up of numpy record arrays. These recarray maps will have one field that is the “primary” field which is used to test if a pixel has a valid value or not. Therefore, these recarray maps should be used to describe associated values that share the exact same valid footprint. Each field in the recarray can be treated as its own HealSparseMap. For example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dtype = [('a', np.float32), ('b', np.int32)]\n", "\n", "map_rec = healsparse.HealSparseMap.make_empty(32, 4096, dtype, primary='a')\n", "\n", "map_rec[0: 10000] = np.zeros(10000, dtype=dtype)\n", "print(map_rec.valid_pixels)\n", "map_rec['a'][0: 5000] = np.arange(5000, dtype=np.float32)\n", "map_rec['b'][5000: 10000] = np.arange(5000, dtype=np.int32)\n", "print(map_rec[map_rec.valid_pixels])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the call `map_rec['a'][0: 5000] = values` will work, but `map_rec[0: 5000]['a'] = values` will not. Also note that using the fields of the recarray cannot be used to set new pixels, this construction can only be used to change pixel values." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Wide Masks\n", "HealSparse has support for “wide” bit masks with an arbitrary number of bits that are referred to by bit position rather than value. This is useful, for example, when constructing a coadd coverage map where every pixel can uniquely identify the set of input exposures that contributed at the location of that pixel. In the case of >64 input exposures you can no longer use a simple 64-bit integer bit mask. Wide mask bits are always specified by giving a list of integer positions rather than values (e.g., use 10 to set the 10th bit instead of `1024 = 2**10`)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "map_wide = healsparse.HealSparseMap.make_empty(32, 4096, healsparse.WIDE_MASK, wide_mask_maxbits=128)\n", "\n", "pixels = np.arange(10000)\n", "map_wide.set_bits_pix(pixels, [4, 100])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(map_wide.check_bits_pix(pixels, [2]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(map_wide.check_bits_pix(pixels, [4]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(map_wide.check_bits_pix(pixels, [100]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(map_wide.check_bits_pix(pixels, [101]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check if any of the bits are set\n", "print(map_wide.check_bits_pos([45.2], [0.2], [100, 101], lonlat=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Writing Maps\n", "Writing a HealSparseMap is easy. Note that all maps except for recarray maps are stored with lossless fits compression and are `.fz` files under-the-hood. The convention that we have adopted is that all healsparse files are stored with the `.hs` extention. Fits compression is very efficient for integer maps, and the lossless gzip compression is actually quite good for maps with holes / empty regions that therefore have repeated `UNSEEN` or equivalent values." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "map1.write('output_file.hs', clobber=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using the DC2 DR6 suprême maps\n", "The survey property maps for DC2 DR6 are available at `/global/cfs/cdirs/lsst/shared/DC2-prod/Run2.2i/addons/supreme/dr6-wfd`. These maps were generated with `nside_sparse = 32768` (and `nside_coverage = 32`)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import glob\n", "import os\n", "# These maps are available for ugrizy, let's just look at the g-band at the moment:\n", "files = sorted(glob.glob('/global/cfs/cdirs/lsst/shared/DC2-prod/Run2.2i/addons/supreme/dr6-wfd/*_g_*.hs'))\n", "for f in files:\n", " print(os.path.basename(f))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reading the coverage map\n", "Let us start with the exposure time map, and let's start by reading just the coverage map at low resolution:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "map_name = '/global/cfs/cdirs/lsst/shared/DC2-prod/Run2.2i/addons/supreme/dr6-wfd/supreme_dc2_dr6d_v3_g_exptime_sum.hs'\n", "cov_map = healsparse.HealSparseCoverage.read(map_name)\n", "cov_mask = cov_map.coverage_mask\n", "cov_pixels, = np.where(cov_mask)\n", "ra, dec = hp.pix2ang(cov_map.nside_coverage, cov_pixels, lonlat=True, nest=True)\n", "plt.plot(ra, dec, 'r.')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reading in a partial map\n", "One advantage of the healsparse format is that we can read in just one (or more than one) coverage pixel to conserve memory (and this is very useful when running code parallelized over different regions of the sky where you do not need the full sky coverage)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sub_map = healsparse.HealSparseMap.read(map_name, pixels=[cov_pixels[21]])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "valid_pixels, ra, dec = sub_map.valid_pixels_pos(return_pixels=True)\n", "plt.hexbin(ra, dec, sub_map[valid_pixels])\n", "plt.colorbar()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's zoom in here...\n", "plt.hexbin(ra, dec, sub_map[valid_pixels], extent=[66.0, 66.4, -42.0, -41.2])\n", "plt.colorbar()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's look at how this is degraded to 4096... this is the highest resolution that is practical to\n", "# use a healpy map on a laptop.\n", "sub_map_dg = sub_map.degrade(nside_out=4096, reduction='mean')\n", "valid_pixels_dg, ra_dg, dec_dg = sub_map_dg.valid_pixels_pos(return_pixels=True)\n", "plt.hexbin(ra_dg, dec_dg, sub_map_dg[valid_pixels_dg], extent=[66.0, 66.4, -42.0, -41.2], gridsize=25)\n", "plt.colorbar()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Map Operations\n", "We can read in another map and perform arithmetic operations between the maps (if you so desire). We will read in a larger region of the second map to demonstrate some options." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "map_name2 = '/global/cfs/cdirs/lsst/shared/DC2-prod/Run2.2i/addons/supreme/dr6-wfd/supreme_dc2_dr6d_v3_r_exptime_sum.hs'\n", "sub_map2 = healsparse.HealSparseMap.read(map_name2, pixels=cov_pixels[20: 22])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We can add or multiply a map by a constant\n", "sub_map_times_2 = sub_map*2\n", "plt.hexbin(ra, dec, sub_map_times_2[valid_pixels])\n", "plt.colorbar()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When operating on more than one map we have to decide if we are going to take the intersection of the valid pixels of the two maps, or the union. If we take the intersection, for summation the blank pixels are given a value of 0; for multiplication, a value of 1." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# An intersection addition\n", "sub_map_sum = healsparse.operations.sum_intersection([sub_map, sub_map2])\n", "valid_pixels_sum, ra_sum, dec_sum = sub_map_sum.valid_pixels_pos(return_pixels=True)\n", "plt.hexbin(ra_sum, dec_sum, sub_map_sum[valid_pixels_sum])\n", "plt.colorbar()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A union addition\n", "sub_map_sum = healsparse.operations.sum_union([sub_map, sub_map2])\n", "valid_pixels_sum, ra_sum, dec_sum = sub_map_sum.valid_pixels_pos(return_pixels=True)\n", "plt.hexbin(ra_sum, dec_sum, sub_map_sum[valid_pixels_sum])\n", "plt.colorbar()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Looking at another map can be interesting.\n", "# This is the effect of differential chromatic refraction on the psf ellipticity e1. It has a pattern that looks nothing like the exposure time!\n", "map_name = '/global/cfs/cdirs/lsst/shared/DC2-prod/Run2.2i/addons/supreme/dr6-wfd/supreme_dc2_dr6d_v3_g_dcr_e1_wmean.hs'\n", "sub_map = healsparse.HealSparseMap.read(map_name, pixels=[cov_pixels[21]])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "valid_pixels, ra, dec = sub_map.valid_pixels_pos(return_pixels=True)\n", "plt.hexbin(ra, dec, sub_map[valid_pixels], extent=[66.0, 66.4, -42.0, -41.2])\n", "plt.colorbar()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## HealSparse Geometry\n", "Finally, we will look at the healsparse geometry library which is built on top of healpy.\n", "\n", "HealSparse has a basic geometry library that allows you to generate maps from circles and convex polygons, as supported by healpy. Each geometric object is associated with a single value. On construction, geometry objects only contain information about the shape, and they are only rendered onto a HEALPix grid when requested.\n", "\n", "There are two methods to realize geometry objects. The first is that each object can be used to generate a HealSparseMap map, and the second, for integer-valued objects is the realize_geom() method which can be used to combine multiple objects by or-ing the integer values together." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# All units are decimal degrees\n", "circ = healsparse.Circle(ra=200.0, dec=0.0, radius=1.0, value=1)\n", "poly = healsparse.Polygon(ra=[200.0, 200.2, 200.3, 200.2, 200.1],\n", " dec=[0.0, 0.1, 0.2, 0.25, 0.13],\n", " value=8)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Making a Map\n", "To make a map from a geometry object, use the `get_map()` method as such. The higher resolution you choose, the better the aliasing at the edges (given that these are pixelized approximations of the true shapes). You can also combine two maps using the general operations. Note that if the polygon is an integer value, the default sentinel when using `get_map()` is 0." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "smap_poly = poly.get_map(nside_coverage=32, nside_sparse=32768, dtype=np.int16)\n", "smap_circ = circ.get_map(nside_coverage=32, nside_sparse=32768, dtype=np.int16)\n", "\n", "combo = healsparse.or_union([smap_poly, smap_circ])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using realize_geom()\n", "You can only use `realize_geom()` to create maps from combinations of polygons if you are using integer maps, and want to or them together. This method is more memory efficient than generating each individual individual map and combining them, as above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "realized_combo = healsparse.HealSparseMap.make_empty(32, 32768, np.int16, sentinel=0)\n", "healsparse.realize_geom([poly, circ], realized_combo)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "valid_pix, ra, dec = combo.valid_pixels_pos(return_pixels=True)\n", "plt.hexbin(ra, dec, combo[valid_pix])\n", "plt.colorbar()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "valid_pix, ra, dec = realized_combo.valid_pixels_pos(return_pixels=True)\n", "plt.hexbin(ra, dec, realized_combo[valid_pix])\n", "plt.colorbar()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Geometry in suprême\n", "In the suprême code, each ccd that was in the coadd is rendered as a polygon with a specific bit value into a `wide_mask` map. In this way, each patch gets an \"input map\" that describes at every nside=32768 pixel a combination of bit values that says which images were used at this specific location. This is the information that is used to compute the weighted-mean survey property quantities, rather than trying to realize these values into the geometry directly. This is very efficient." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Making suprême maps\n", "Making suprême maps is easy. If you have `supreme` set up (technical details are beyond the scope of this tutorial, but feel free to ping me), you can use the `supreme_mapper.py` command-line script. Or you can do it in code. While `healsparse` is fully general and can be used anywhere that `healpy` is available, `supreme` depends on the Rubin Software Pipelines." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import supreme\n", "import lsst.daf.persistence as dafPersist\n", "import os" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "scratchdir = os.environ['CSCRATCH']\n", "tutorial_path = os.path.join(scratchdir, 'healsparse_tutorial_1120')\n", "if not os.path.isdir(tutorial_path):\n", " os.makedirs(tutorial_path)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "butler = dafPersist.Butler('/global/cfs/cdirs/lsst/production/DC2_ImSim/Run2.2i/desc_dm_drp/v19.0.0-v1/rerun/run2.2i-coadd-wfd-dr6-v1-grizy/')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "configfile = os.path.join(tutorial_path, 'tutorial_test_config.yaml')\n", "with open(configfile, \"w\") as f:\n", " f.write(\"outbase: 'supreme_dc2_dr6_tutorial_test'\\n\")\n", " f.write(\"nside: 32768\\n\")\n", " f.write(\"map_types:\\n\")\n", " f.write(\" exptime: ['sum']\\n\")\n", " f.write(\" airmass: ['wmean']\\n\")\n", " f.write(\"use_calexp_mask: False\\n\")\n", " f.write(\"bad_mask_planes: ['BAD', 'NO_DATA', 'SAT', 'SUSPECT']\\n\")\n", " f.write(\"detector_id_name: detector\\n\")\n", " f.write(\"visit_id_name: visit\\n\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config = supreme.Configuration.load_yaml(configfile)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mapper = supreme.MultiMapper(butler, config, tutorial_path, ncores=1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tracts = [3076]\n", "filters = ['r']\n", "patches = ['2,2']\n", "# Consolidate = False says we are just running 1 patch, do not consolidate into a tract.\n", "mapper(tracts, filters, patches=patches, consolidate=False, clobber=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "airmass = healsparse.HealSparseMap.read(os.path.join(tutorial_path, '3076', 'patches', 'supreme_dc2_dr6_tutorial_test_03076_2,2_r_airmass_wmean.hs'))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "valid_pix, ra, dec = airmass.valid_pixels_pos(return_pixels=True)\n", "plt.hexbin(ra, dec, airmass[valid_pix])\n", "plt.colorbar()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "desc-stack-weekly-latest", "language": "python", "name": "desc-stack-weekly-latest" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.8" } }, "nbformat": 4, "nbformat_minor": 4 }