{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Astropy: Tables"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "<section class=\"objectives panel panel-warning\">\n",
    "<div class=\"panel-heading\">\n",
    "<h2><span class=\"fa fa-certificate\"></span> Objectives</h2>\n",
    "</div>\n",
    "\n",
    "\n",
    "<div class=\"panel-body\">\n",
    "\n",
    "<ul>\n",
    "<li>Create tables</li>\n",
    "<li>Access data in tables</li>\n",
    "<li>Combining tables</li>\n",
    "<li>Aggregation</li>\n",
    "<li>Masking</li>\n",
    "<li>Reading/writing</li>\n",
    "</ul>\n",
    "\n",
    "</div>\n",
    "\n",
    "</section>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Documentation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For more information about the features presented below, you can read the\n",
    "[astropy.table](http://docs.astropy.org/en/stable/table/index.html) docs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating tables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from astropy.table import Table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Creating a table from scratch\n",
    "t1 = Table()\n",
    "t1['name'] = ['source 1', 'source 2', 'source 3']\n",
    "t1['flux'] = [1.2, 2.2, 3.1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Looking at the table\n",
    "t1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Adding a column\n",
    "t1['size'] = [1,5,4]\n",
    "t1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Accessing a column\n",
    "t1['size']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Converting to a Numpy array\n",
    "np.array(t1['size'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Accessing a cell\n",
    "t1['size'][0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Accessing a row\n",
    "t1[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Units in tables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set unit on column\n",
    "t1['size'].unit = 'cm'\n",
    "t1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Some unitful operations will then work:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t1['size'].to('m')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, you may run into unexpected behavior, so if you are planning on using table columns as Quantities, we recommend that you use the ``QTable`` class:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "type(t1['size'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from astropy.table import QTable\n",
    "qt1 = QTable(t1)\n",
    "type(qt1['size'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "<section class=\"challenge panel panel-success\">\n",
    "<div class=\"panel-heading\">\n",
    "<h2><span class=\"fa fa-pencil\"></span> Challenge</h2>\n",
    "</div>\n",
    "\n",
    "\n",
    "<div class=\"panel-body\">\n",
    "\n",
    "<ol>\n",
    "<li>Make a table that contains three columns: <code>spectral type</code>, <code>temperature</code>, and <code>radius</code>, and incude 5 rows with fake data (or real data if you like, for example from <a href=\"http://www.atlasoftheuniverse.com/startype.html\">here</a>). Try including units on the columns that can have them.</li>\n",
    "<li>Find the mean temperature and the maximum radius</li>\n",
    "<li>Try and find out how to add and remove rows</li>\n",
    "<li>Add a new column which gives the luminosity (using $L=4\\pi R^2 \\sigma T^4$)</li>\n",
    "</ol>\n",
    "\n",
    "</div>\n",
    "\n",
    "</section>\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#1\n",
    "from astropy import units as u\n",
    "t = QTable()\n",
    "t['spectral type'] = ['O5', 'B5', 'A5', 'F5', 'G5']\n",
    "t['radius'] = [12, 3.9, 1.7, 1.3, 0.92] * u.R_sun\n",
    "t['temperature'] = [45000, 15000, 8200, 6400, 5700] * u.K\n",
    "t"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#2\n",
    "print('Mean temperature:', np.mean(t['temperature']))\n",
    "print('Maximum radius:', np.mean(t['radius']))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#3\n",
    "t.add_row({'spectral type': 'K5',\n",
    "           'temperature': 4300 * u.K,\n",
    "           'radius': 0.72 * u.R_sun})\n",
    "t.remove_row(0)\n",
    "t"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#4\n",
    "from numpy import pi\n",
    "from astropy.constants import sigma_sb\n",
    "t['luminosity'] = (4 * pi * t['radius'] ** 2 * sigma_sb * t['temperature'] ** 4).to(u.L_sun)\n",
    "t"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Iterating over tables"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is possible to iterate over rows or over columns. To iterate over rows, simply iterate over the table itself:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for row in t1:\n",
    "    print(row)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Rows can act like dictionaries, so you can access specific columns from a row:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for row in t1:\n",
    "    print(row['name'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Iterating over columns is also easy:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for colname in t1.columns:\n",
    "    column = t1[colname]\n",
    "    print(column)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Accessing specific rows from a column object can also be done with the item notation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for colname in t1.columns:\n",
    "    column = t1[colname]\n",
    "    print(column[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Joining tables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from astropy.table import join"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t2 = Table()\n",
    "t2['name'] = ['source 1', 'source 3']\n",
    "t2['flux2'] = [1,9]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t3 = join(t1, t2, join_type='outer')\n",
    "t3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.mean(t3['flux2'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Masked tables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t4 = Table(masked=True)\n",
    "t4['name'] = ['source 1', 'source 2', 'source 3']\n",
    "t4['flux'] = [1.2, 2.2, 3.1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t4['flux'].mask = [1,0,1]\n",
    "t4"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Slicing\n",
    "\n",
    "Tables can be sliced like Numpy arrays:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "obs = Table.read(\"\"\"name    obs_date    mag_b  mag_v\n",
    "                    M31     2012-01-02  17.0   17.5\n",
    "                    M31     2012-01-02  17.1   17.4\n",
    "                    M101    2012-01-02  15.1   13.5\n",
    "                    M82     2012-02-14  16.2   14.5\n",
    "                    M31     2012-02-14  16.9   17.3\n",
    "                    M82     2012-02-14  15.2   15.5\n",
    "                    M101    2012-02-14  15.0   13.6\n",
    "                    M82     2012-03-26  15.7   16.5\n",
    "                    M101    2012-03-26  15.1   13.5\n",
    "                    M101    2012-03-26  14.8   14.3\n",
    "                    \"\"\", format='ascii')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "obs[1:4]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "obs[obs['mag_b'] > 16]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "obs['mag_b', 'mag_v']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "<section class=\"challenge panel panel-success\">\n",
    "<div class=\"panel-heading\">\n",
    "<h2><span class=\"fa fa-pencil\"></span> Challenge</h2>\n",
    "</div>\n",
    "\n",
    "\n",
    "<div class=\"panel-body\">\n",
    "\n",
    "<p>Starting from the <code>obs</code> table:</p>\n",
    "<ol>\n",
    "<li>Make a new table that shows every other row, starting with the second row? (that is, the second, fourth, sixth, etc. rows).</li>\n",
    "<li>Make a new table the only contains rows where <code>name</code> is <code>M31</code></li>\n",
    "</ol>\n",
    "\n",
    "</div>\n",
    "\n",
    "</section>\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#1\n",
    "subset1 = obs[1::2]\n",
    "subset1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#2\n",
    "subset2 = obs[obs['name'] == 'M31']\n",
    "subset2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Grouping and Aggregation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "obs_by_name = obs.group_by('name')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "obs_by_name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for group in obs_by_name.groups:\n",
    "    print(group)\n",
    "    print(\"\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "obs_by_name.groups.aggregate(np.mean)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Writing data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "obs.write('test.fits', overwrite=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "obs.write('test.vot', format='votable', overwrite=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Reading data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t4 = Table.read('2mass.tbl', format='ascii.ipac')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t4"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "<section class=\"challenge panel panel-success\">\n",
    "<div class=\"panel-heading\">\n",
    "<h2><span class=\"fa fa-pencil\"></span> Challenge</h2>\n",
    "</div>\n",
    "\n",
    "\n",
    "<div class=\"panel-body\">\n",
    "\n",
    "<p>Using the <code>t4</code> table above:</p>\n",
    "<ol>\n",
    "<li>\n",
    "<p>Make a plot that shows <code>j_m</code>-<code>h_m</code> on the x-axis, and <code>h_m</code>-<code>k_m</code> on the y-axis</p>\n",
    "</li>\n",
    "<li>\n",
    "<p>Make a new table that contains the subset of rows where the <code>j_snr</code>, <code>h_snr</code>, and <code>k_snr</code> columns, which give the signal-to-noise-ratio in the J, H, and K band, are greater than 10, and try and show these points in red in the plot you just made.</p>\n",
    "</li>\n",
    "<li>\n",
    "<p>Make a new table (based on the full table) that contains only the RA, Dec, and the <code>j_m</code>, <code>h_m</code> and <code>k_m</code> columns, then try and write out this catalog into a format that you can read into another software package. For example, try and write out the catalog into CSV format, then read it into a spreadsheet software package (e.g. Excel, Google Docs, Numbers, OpenOffice). You may run into an issue at this point - if so, take a look at https://github.com/astropy/astropy/issues/7357 to see how to fix it.</p>\n",
    "</li>\n",
    "</ol>\n",
    "\n",
    "</div>\n",
    "\n",
    "</section>\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#1\n",
    "import matplotlib.pyplot as plt\n",
    "plt.scatter(t4['j_m'] - t4['h_m'], t4['h_m'] - t4['k_m'], )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#2\n",
    "subset = t4[(t4['j_snr'] > 10) & (t4['h_snr'] > 10) & (t4['k_snr'] > 10)]\n",
    "subset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#2 (continued)\n",
    "import matplotlib.pyplot as plt\n",
    "plt.scatter(t4['j_m'] - t4['h_m'],\n",
    "            t4['h_m'] - t4['k_m'],\n",
    "            s=5, color='black')\n",
    "plt.scatter(subset['j_m'] - subset['h_m'],\n",
    "            subset['h_m'] - subset['k_m'],\n",
    "            s=30, color='red', alpha=0.5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#3\n",
    "simple = t4['ra', 'dec', 'j_m', 'h_m', 'k_m']\n",
    "simple"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#3 (continued)\n",
    "simple.write('2mass_subset.csv', format='ascii.csv', overwrite=True, comment='#')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}