{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Astropy: Tables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "
\n", "
\n", "

Objectives

\n", "
\n", "\n", "\n", "
\n", "\n", "\n", "\n", "
\n", "\n", "
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Documentation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more information about the features presented below, you can read the\n", "[astropy.table](http://docs.astropy.org/en/stable/table/index.html) docs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating tables" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from astropy.table import Table" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Creating a table from scratch\n", "t1 = Table()\n", "t1['name'] = ['source 1', 'source 2', 'source 3']\n", "t1['flux'] = [1.2, 2.2, 3.1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Looking at the table\n", "t1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Adding a column\n", "t1['size'] = [1,5,4]\n", "t1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Accessing a column\n", "t1['size']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Converting to a Numpy array\n", "np.array(t1['size'])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Accessing a cell\n", "t1['size'][0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Accessing a row\n", "t1[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Units in tables" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set unit on column\n", "t1['size'].unit = 'cm'\n", "t1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some unitful operations will then work:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t1['size'].to('m')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, you may run into unexpected behavior, so if you are planning on using table columns as Quantities, we recommend that you use the ``QTable`` class:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(t1['size'])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from astropy.table import QTable\n", "qt1 = QTable(t1)\n", "type(qt1['size'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "
\n", "
\n", "

Challenge

\n", "
\n", "\n", "\n", "
\n", "\n", "
    \n", "
  1. Make a table that contains three columns: spectral type, temperature, and radius, and incude 5 rows with fake data (or real data if you like, for example from here). Try including units on the columns that can have them.
  2. \n", "
  3. Find the mean temperature and the maximum radius
  4. \n", "
  5. Try and find out how to add and remove rows
  6. \n", "
  7. Add a new column which gives the luminosity (using $L=4\\pi R^2 \\sigma T^4$)
  8. \n", "
\n", "\n", "
\n", "\n", "
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#1\n", "from astropy import units as u\n", "t = QTable()\n", "t['spectral type'] = ['O5', 'B5', 'A5', 'F5', 'G5']\n", "t['radius'] = [12, 3.9, 1.7, 1.3, 0.92] * u.R_sun\n", "t['temperature'] = [45000, 15000, 8200, 6400, 5700] * u.K\n", "t" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#2\n", "print('Mean temperature:', np.mean(t['temperature']))\n", "print('Maximum radius:', np.mean(t['radius']))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#3\n", "t.add_row({'spectral type': 'K5',\n", " 'temperature': 4300 * u.K,\n", " 'radius': 0.72 * u.R_sun})\n", "t.remove_row(0)\n", "t" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#4\n", "from numpy import pi\n", "from astropy.constants import sigma_sb\n", "t['luminosity'] = (4 * pi * t['radius'] ** 2 * sigma_sb * t['temperature'] ** 4).to(u.L_sun)\n", "t" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iterating over tables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is possible to iterate over rows or over columns. To iterate over rows, simply iterate over the table itself:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for row in t1:\n", " print(row)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Rows can act like dictionaries, so you can access specific columns from a row:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for row in t1:\n", " print(row['name'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Iterating over columns is also easy:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for colname in t1.columns:\n", " column = t1[colname]\n", " print(column)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Accessing specific rows from a column object can also be done with the item notation:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for colname in t1.columns:\n", " column = t1[colname]\n", " print(column[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Joining tables" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from astropy.table import join" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t2 = Table()\n", "t2['name'] = ['source 1', 'source 3']\n", "t2['flux2'] = [1,9]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t3 = join(t1, t2, join_type='outer')\n", "t3" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.mean(t3['flux2'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Masked tables" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t4 = Table(masked=True)\n", "t4['name'] = ['source 1', 'source 2', 'source 3']\n", "t4['flux'] = [1.2, 2.2, 3.1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t4['flux'].mask = [1,0,1]\n", "t4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Slicing\n", "\n", "Tables can be sliced like Numpy arrays:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "obs = Table.read(\"\"\"name obs_date mag_b mag_v\n", " M31 2012-01-02 17.0 17.5\n", " M31 2012-01-02 17.1 17.4\n", " M101 2012-01-02 15.1 13.5\n", " M82 2012-02-14 16.2 14.5\n", " M31 2012-02-14 16.9 17.3\n", " M82 2012-02-14 15.2 15.5\n", " M101 2012-02-14 15.0 13.6\n", " M82 2012-03-26 15.7 16.5\n", " M101 2012-03-26 15.1 13.5\n", " M101 2012-03-26 14.8 14.3\n", " \"\"\", format='ascii')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "obs[1:4]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "obs[obs['mag_b'] > 16]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "obs['mag_b', 'mag_v']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "
\n", "
\n", "

Challenge

\n", "
\n", "\n", "\n", "
\n", "\n", "

Starting from the obs table:

\n", "
    \n", "
  1. Make a new table that shows every other row, starting with the second row? (that is, the second, fourth, sixth, etc. rows).
  2. \n", "
  3. Make a new table the only contains rows where name is M31
  4. \n", "
\n", "\n", "
\n", "\n", "
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#1\n", "subset1 = obs[1::2]\n", "subset1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#2\n", "subset2 = obs[obs['name'] == 'M31']\n", "subset2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Grouping and Aggregation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "obs_by_name = obs.group_by('name')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "obs_by_name" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for group in obs_by_name.groups:\n", " print(group)\n", " print(\"\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "obs_by_name.groups.aggregate(np.mean)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Writing data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "obs.write('test.fits', overwrite=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "obs.write('test.vot', format='votable', overwrite=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t4 = Table.read('2mass.tbl', format='ascii.ipac')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "
\n", "
\n", "

Challenge

\n", "
\n", "\n", "\n", "
\n", "\n", "

Using the t4 table above:

\n", "
    \n", "
  1. \n", "

    Make a plot that shows j_m-h_m on the x-axis, and h_m-k_m on the y-axis

    \n", "
  2. \n", "
  3. \n", "

    Make a new table that contains the subset of rows where the j_snr, h_snr, and k_snr columns, which give the signal-to-noise-ratio in the J, H, and K band, are greater than 10, and try and show these points in red in the plot you just made.

    \n", "
  4. \n", "
  5. \n", "

    Make a new table (based on the full table) that contains only the RA, Dec, and the j_m, h_m and k_m columns, then try and write out this catalog into a format that you can read into another software package. For example, try and write out the catalog into CSV format, then read it into a spreadsheet software package (e.g. Excel, Google Docs, Numbers, OpenOffice). You may run into an issue at this point - if so, take a look at https://github.com/astropy/astropy/issues/7357 to see how to fix it.

    \n", "
  6. \n", "
\n", "\n", "
\n", "\n", "
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#1\n", "import matplotlib.pyplot as plt\n", "plt.scatter(t4['j_m'] - t4['h_m'], t4['h_m'] - t4['k_m'], )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#2\n", "subset = t4[(t4['j_snr'] > 10) & (t4['h_snr'] > 10) & (t4['k_snr'] > 10)]\n", "subset" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#2 (continued)\n", "import matplotlib.pyplot as plt\n", "plt.scatter(t4['j_m'] - t4['h_m'],\n", " t4['h_m'] - t4['k_m'],\n", " s=5, color='black')\n", "plt.scatter(subset['j_m'] - subset['h_m'],\n", " subset['h_m'] - subset['k_m'],\n", " s=30, color='red', alpha=0.5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#3\n", "simple = t4['ra', 'dec', 'j_m', 'h_m', 'k_m']\n", "simple" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#3 (continued)\n", "simple.write('2mass_subset.csv', format='ascii.csv', overwrite=True, comment='#')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 1 }