{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Astropy introduction for Gammapy users\n", "\n", "\n", "## Introduction\n", "\n", "To become efficient at using Gammapy, you have to learn to use some parts of Astropy, especially FITS I/O and how to work with Table, Quantity, SkyCoord, Angle and Time objects.\n", "\n", "Gammapy is built on Astropy, meaning that data in Gammapy is often stored in Astropy objects, and the methods on those objects are part of the public Gammapy API.\n", "\n", "This tutorial is a quick introduction to the parts of Astropy you should become familiar with to use Gammapy (or, even without Gammapy, to do astronomy from Python scripts). The largest part is devoted to tables, which are the most important building block for Gammapy (event lists, flux points, light curves, ... many other things are stored in Table objects).\n", "\n", "We will:\n", "\n", "- open and write FITS files with [io.fits](http://docs.astropy.org/en/stable/io/fits/index.html)\n", "- manipulate [coordinates](http://docs.astropy.org/en/stable/coordinates/): [SkyCoord](http://docs.astropy.org/en/stable/api/astropy.coordinates.SkyCoord.html) and [Angle](http://docs.astropy.org/en/stable/coordinates/angles.html) classes\n", "- use [units](http://docs.astropy.org/en/stable/units/index.html) and [Quantities](http://docs.astropy.org/en/stable/api/astropy.units.Quantity.html). See also this [tutorial](http://www.astropy.org/astropy-tutorials/Quantities.html)\n", "- manipulate [Times and Dates](http://docs.astropy.org/en/stable/time/index.html)\n", "- use [tables](http://docs.astropy.org/en/stable/table/index.html) with the [Table](http://docs.astropy.org/en/stable/api/astropy.table.Table.html) class, using the Fermi catalog as an example\n", "- define regions in the sky with the [region](http://astropy-regions.readthedocs.io/en/latest/getting_started.html) package\n", "\n", "## Setup" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# To make plots appear in the notebook\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# If something below doesn't work, here's how you\n", "# can check which versions of Numpy and Astropy you have.\n", "# All examples should work with Astropy 1.3,\n", "# most even with Astropy 1.0\n", "from pathlib import Path\n", "import numpy as np\n", "import astropy\n", "import os\n", "\n", "print(\"numpy:\", np.__version__)\n", "print(\"astropy:\", astropy.__version__)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Units, Quantities and constants\n", "import astropy.units as u\n", "from astropy.units import Quantity\n", "import astropy.constants as cst\n", "\n", "from astropy.io import fits\n", "from astropy.table import Table\n", "from astropy.coordinates import SkyCoord, Angle\n", "from astropy.time import Time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Units and constants\n", "\n", "### Basic usage" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# One can create a Quantity like this\n", "L = Quantity(1e35, unit=\"erg/s\")\n", "# or like this\n", "d = 8 * u.kpc\n", "\n", "# then one can produce new Quantities\n", "flux = L / (4 * np.pi * d ** 2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# And convert it to an equivalent unit\n", "flux.to(\"erg cm-2 s-1\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "flux.to(\"W/m2\")" ] }
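, { "cell_type": "markdown", "metadata": {}, "source": [ "The constants in `astropy.constants` are Quantities as well, so they combine naturally with units. As a quick illustration, here is the electron rest energy:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Physical constants carry a value, a unit and an uncertainty\n", "print(cst.c)\n", "\n", "# e.g. the electron rest energy m_e c^2\n", "print((cst.m_e * cst.c ** 2).to(\"MeV\"))" ] }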
, { "cell_type": "markdown", "metadata": {}, "source": [ "More generally, a Quantity is a Numpy array with a unit attached." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "E = np.logspace(1, 4, 10) * u.GeV\n", "E.to(\"TeV\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we compute the interaction time of protons, given a cross section and a target density." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x_eff = 30 * u.mbarn\n", "density = 1 * u.cm ** -3\n", "\n", "interaction_time = (density * x_eff * cst.c) ** -1\n", "\n", "interaction_time.to(\"Myr\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Use Quantities in functions\n", "\n", "Here we compute the energy loss rate of an electron of kinetic energy E in a magnetic field B. See formula (5B10) in this [lecture](http://www.cv.nrao.edu/course/astr534/SynchrotronPower.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def electron_energy_loss_rate(B, E):\n", "    \"\"\" energy loss rate of an electron of kinetic energy E in magnetic field B\n", "    \"\"\"\n", "    U_B = B ** 2 / (2 * cst.mu0)\n", "    gamma = (\n", "        E / (cst.m_e * cst.c ** 2) + 1\n", "    )  # note that this works only because E/(cst.m_e*cst.c**2) is dimensionless\n", "    beta = np.sqrt(1 - 1 / gamma ** 2)\n", "    return 4.0 / 3.0 * cst.sigma_T * cst.c * gamma ** 2 * beta ** 2 * U_B\n", "\n", "\n", "print(electron_energy_loss_rate(1e-5 * u.G, 1 * u.TeV).to(\"erg/s\"))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Now plot the characteristic energy loss time E / (dE/dt) as a function of energy\n", "E_elec = np.logspace(-1.0, 6, 100) * u.MeV\n", "B = 1 * u.G\n", "y = (E_elec / electron_energy_loss_rate(B, E_elec)).to(\"yr\")\n", "plt.loglog(E_elec, y);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A frequent issue is dimensional consistency of the arguments. One can use the `u.quantity_input` decorator to enforce it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The decorator ensures that B and E are homogeneous to a magnetic field strength and an energy.\n", "# If not, a UnitsError exception is raised.\n", "@u.quantity_input(B=u.T, E=u.J)\n", "def electron_energy_loss_rate(B, E):\n", "    \"\"\" energy loss rate of an electron of kinetic energy E in magnetic field B\n", "    \"\"\"\n", "    U_B = B ** 2 / (2 * cst.mu0)\n", "    gamma = (\n", "        E / (cst.m_e * cst.c ** 2) + 1\n", "    )  # note that this works only because E/(cst.m_e*cst.c**2) is dimensionless\n", "    beta = np.sqrt(1 - 1 / gamma ** 2)\n", "    return 4.0 / 3.0 * cst.sigma_T * cst.c * gamma ** 2 * beta ** 2 * U_B\n", "\n", "\n", "# Now try it\n", "try:\n", "    print(electron_energy_loss_rate(1e-5 * u.G, 1 * u.Hz).to(\"erg/s\"))\n", "except u.UnitsError as message:\n", "    print(\"Incorrect unit: \" + str(message))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Coordinates\n", "\n", "Note that a SkyCoord object can hold arrays of coordinates. We will see that in more detail in the next section."
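, "\n", "As a minimal preview (the values below are just example positions), a single SkyCoord can be built directly from arrays of coordinates:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A SkyCoord built from arrays holds many positions at once\n", "coords = SkyCoord([10.68, 83.63], [41.27, 22.01], unit=\"deg\")\n", "print(coords.ra)\n", "print(coords.dec)"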
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Different ways to create a SkyCoord\n", "c1 = SkyCoord(10.625, 41.2, frame=\"icrs\", unit=\"deg\")\n", "c1 = SkyCoord(\"00h42m30s\", \"+41d12m00s\", frame=\"icrs\")\n", "\n", "c2 = SkyCoord(83.633083, 22.0145, unit=\"deg\")\n", "# If you have internet access, you could also use this to define `c2`:\n", "# c2 = SkyCoord.from_name(\"Crab\")  # Query the coordinates from CDS by name\n", "\n", "print(c1.ra, c2.dec)\n", "# separation returns an Angle object\n", "print(\"Angular separation to the Crab: \", c1.separation(c2))\n", "print(\"Angular separation to the Crab: \", c1.separation(c2).degree)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Coordinate transformations\n", "\n", "How to change between coordinate frames: the Crab position in Galactic coordinates." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c2b = c2.galactic\n", "print(c2b)\n", "print(c2b.l, c2b.b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Time\n", "\n", "Is the Crab visible now?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "now = Time.now()\n", "print(now)\n", "print(now.mjd)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define the location for the AltAz frame\n", "from astropy.coordinates import EarthLocation, AltAz\n", "\n", "paris = EarthLocation(lat=48.8567 * u.deg, lon=2.3508 * u.deg)\n", "\n", "# Compute the horizontal (alt-az) coordinates\n", "crab_altaz = c2.transform_to(AltAz(obstime=now, location=paris))\n", "\n", "print(crab_altaz)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Table: Manipulating the 3FGL catalog\n", "\n", "Here we are going to do some selections with the 3FGL catalog. To do so we use the Table class from Astropy.\n", "\n", "### Opening the table\n", "First, we need to read the catalog into a Table. 
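The catalog file is found via the `GAMMAPY_DATA` environment variable, so let's first check that it is set:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check that the GAMMAPY_DATA environment variable is set\n", "print(os.environ.get(\"GAMMAPY_DATA\", \"GAMMAPY_DATA is not set!\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now open the catalog.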
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Open Fermi 3FGL from the repo\n", "filename = os.environ[\"GAMMAPY_DATA\"] / Path(\n", " \"catalogs/fermi/gll_psc_v16.fit.gz\"\n", ")\n", "table = Table.read(str(filename), hdu=1)\n", "# Alternatively, one can grab it from the server.\n", "# table = Table.read(\"http://fermi.gsfc.nasa.gov/ssc/data/access/lat/4yr_catalog/gll_psc_v16.fit\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Note that a single FITS file might contain different tables in different HDUs\n", "filename = os.environ[\"GAMMAPY_DATA\"] / Path(\n", " \"catalogs/fermi/gll_psc_v16.fit.gz\"\n", ")\n", "# You can load a `fits.HDUList` and check the extension names\n", "print([_.name for _ in fits.open(str(filename))])\n", "# Then you can load by name or integer index via the `hdu` option\n", "extended_source_table = Table.read(str(filename), hdu=\"ExtendedSources\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### General informations on the Table\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "table.info()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Statistics on each column\n", "table.info(\"stats\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "### list of column names\n", "table.colnames" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# HTML display\n", "# table.show_in_browser(jsviewer=True)\n", "# table.show_in_notebook(jsviewer=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Accessing the table" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The header keywords are stored as a dict\n", "# table.meta\n", "table.meta[\"TSMIN\"]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# First row\n", "table[0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Spectral index of the 5 first entries\n", "table[:5][\"Spectral_Index\"]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Which source has the lowest spectral index?\n", "row = table[np.argmin(table[\"Spectral_Index\"])]\n", "print(\n", " \"Hardest source: \",\n", " row[\"Source_Name\"],\n", " row[\"CLASS1\"],\n", " row[\"Spectral_Index\"],\n", ")\n", "\n", "# Which source has the largest spectral index?\n", "row = table[np.argmax(table[\"Spectral_Index\"])]\n", "print(\n", " \"Softest source: \",\n", " row[\"Source_Name\"],\n", " row[\"CLASS1\"],\n", " row[\"Spectral_Index\"],\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Quantities and SkyCoords from a Table" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fluxes = table[\"nuFnu1000_3000\"].quantity\n", "fluxes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "coord = SkyCoord(table[\"GLON\"], table[\"GLAT\"], frame=\"galactic\")\n", "coord.fk5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Selections in a Table\n", "\n", "Here we select Sources according to their class and do some whole sky chart" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], 
"source": [ "# Get coordinates of FSRQs\n", "fsrq = np.where(\n", " np.logical_or(table[\"CLASS1\"] == \"fsrq \", table[\"CLASS1\"] == \"FSQR \")\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This is here for plotting purpose...\n", "# glon = glon.wrap_at(180*u.degree)\n", "\n", "# Open figure\n", "fig = plt.figure(figsize=(14, 8))\n", "ax = fig.add_subplot(111, projection=\"aitoff\")\n", "ax.scatter(\n", " coord[fsrq].l.wrap_at(180 * u.degree).radian,\n", " coord[fsrq].b.radian,\n", " color=\"k\",\n", " label=\"FSRQ\",\n", ")\n", "ax.grid(True)\n", "ax.legend()\n", "# ax.invert_xaxis() -> This does not work for projections..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Now do it for a series of classes\n", "fig = plt.figure(figsize=(14, 10))\n", "ax = fig.add_subplot(111, projection=\"aitoff\")\n", "\n", "source_classes = [\"\", \"psr\", \"spp\", \"fsrq\", \"bll\", \"bin\"]\n", "\n", "for source_class in source_classes:\n", " # We select elements with correct class in upper or lower characters\n", " index = np.array(\n", " [_.strip().lower() == source_class for _ in table[\"CLASS1\"]]\n", " )\n", "\n", " label = source_class if source_class else \"unid\"\n", "\n", " ax.scatter(\n", " coord[index].l.wrap_at(180 * u.degree).radian,\n", " coord[index].b.radian,\n", " label=label,\n", " )\n", "\n", "ax.grid(True)\n", "ax.legend();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating tables\n", "\n", "A `Table` is basically a dict mapping column names to column values, where a column value is a Numpy array (or Quantity object, which is a Numpy array sub-class). This implies that adding columns to a table after creation is nice and easy, but adding a row is hard and slow, basically all data has to be copied and all objects that make up a Table have to be re-created.\n", "\n", "Here's one way to create a `Table` from scratch: put the data into a list of dicts, and then call the `Table` constructor with the `rows` option." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rows = [dict(a=42, b=\"spam\"), dict(a=43, b=\"ham\")]\n", "my_table = Table(rows=rows)\n", "my_table" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Writing tables\n", "\n", "Writing tables to files is easy, you can just give the filename and format you want.\n", "If you run a script repeatedly you might want to add `overwrite=True`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Examples how to write a table in different formats\n", "# Uncomment if you really want to do it\n", "\n", "# my_table.write('/tmp/table.fits', format='fits')\n", "# my_table.write('/tmp/table.fits.gz', format='fits')\n", "# my_table.write('/tmp/table.ecsv', format='ascii.ecsv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "FITS (and some other formats, e.g. HDF5) support writing multiple tables to a single file.\n", "The `table.write` API doesn't support that directly yet.\n", "Here's how you can currently write multiple tables to a FITS file: you have to convert the `astropy.table.Table` objects to `astropy.io.fits.BinTable` objects, and then store them in a `astropy.io.fits.HDUList` objects and call `HDUList.writeto`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_table2 = Table(data=dict(a=[1, 2, 3]))\n", "hdu_list = fits.HDUList(\n", "    [\n", "        fits.PrimaryHDU(),  # need an empty primary HDU\n", "        fits.table_to_hdu(my_table),\n", "        fits.table_to_hdu(my_table2),\n", "    ]\n", ")\n", "# hdu_list.writeto('tables.fits')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tables and pandas\n", "\n", "[pandas](http://pandas.pydata.org/) is one of the most-used packages in the scientific Python stack. Numpy provides the `ndarray` object and functions that operate on `ndarray` objects. Pandas provides the `DataFrame` and `Series` objects, which roughly correspond to the Astropy `Table` and `Column` objects. While both `pandas.DataFrame` and `astropy.table.Table` can often be used to work with tabular data, each has features that the other doesn't. When Astropy was started, it was decided not to base it on `pandas.DataFrame`, but to introduce `Table`, mainly because `pandas.DataFrame` doesn't support multi-dimensional columns, which FITS does and astronomers sometimes use.\n", "\n", "But `pandas.DataFrame` has a ton of features that `Table` doesn't, and is highly optimised, so if you find something to be hard with `Table`, you can convert it to a `DataFrame` and do your work there. As explained on the [interfacing with the pandas package](http://docs.astropy.org/en/stable/table/pandas.html) page in the Astropy docs, it is easy to go back and forth between Table and DataFrame:\n", "\n", "    table = Table.from_pandas(dataframe)\n", "    dataframe = table.to_pandas()\n", "\n", "Let's try it out with the Fermi-LAT catalog.\n", "\n", "One little trick is needed when converting to a DataFrame: we need to drop the multi-dimensional columns that the 3FGL catalog uses for a few quantities (flux up/down errors, and lightcurves):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "scalar_colnames = tuple(\n", "    name for name in table.colnames if len(table[name].shape) <= 1\n", ")\n", "data_frame = table[scalar_colnames].to_pandas()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# If you want to have a quick look at the DataFrame:\n", "# data_frame\n", "# data_frame.info()\n", "# data_frame.describe()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Just to demonstrate one of the useful DataFrame methods,\n", "# this is how you can count the number of sources in each class:\n", "data_frame[\"CLASS1\"].value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you'd like to learn more about pandas, have a look [here](http://pandas.pydata.org/pandas-docs/stable/10min.html) or [here](http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/03.00-Introduction-to-Pandas.ipynb)." ] }
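, { "cell_type": "markdown", "metadata": {}, "source": [ "Before moving on to the exercises, here is the round trip back from a DataFrame to a Table, using the DataFrame created above:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Convert the DataFrame back into an Astropy Table\n", "table_back = Table.from_pandas(data_frame)\n", "table_back[:3]" ] }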
, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercises\n", "\n", "- When we searched for the hardest and softest sources in 3FGL, we did not look at the type of spectrum (PL, ECPL, etc.). Find the hardest and softest PL sources instead.\n", "- Replot the all-sky chart of sources in RA/Dec instead of Galactic coordinates.\n", "- Find the 3FGL sources that are visible from Paris right now." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.0" } }, "nbformat": 4, "nbformat_minor": 2 }