{ "nbformat_minor": 0, "nbformat": 4, "cells": [ { "source": [ "# Reading data into Astropy Tables\n", "\n", "## Objectives\n", "\n", " - Read ASCII files with a defined format\n", " - Learn basic operations with `astropy.table`\n", " - Ingest header information\n", " - Read VOTables" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "## Reading data\n", "\n", "Our first task with Python was to read a `csv` file using `np.loadtxt()`.\n", "That function has a few parameters to define the column delimiter, skip rows, handle commented lines, convert values while reading, etc.\n", "\n", "However, the result is a plain array, without any of the metadata that the file may have included (column names, units, ...).\n", "\n", "Astropy offers an ASCII reader that improves on many of these steps and also provides templates for reading ASCII formats that are common in astronomy.\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": 1, "cell_type": "code", "source": [ "from astropy.io import ascii" ], "outputs": [], "metadata": { "collapsed": true, "keep": true } }, { "execution_count": 2, "cell_type": "code", "source": [ "# Read a sample file: sources.dat\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "`read` has automatically identified the header and the format of each column. The result is a `Table` object, which brings some additional properties." 
], "cell_type": "markdown", "metadata": {} }, { "execution_count": 3, "cell_type": "code", "source": [ "# Show the info of the data read\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 4, "cell_type": "code", "source": [ "# Get the name of the columns\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 5, "cell_type": "code", "source": [ "# Get just the values of a particular column\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 6, "cell_type": "code", "source": [ "# get the first element\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "Astropy [can read a variety of formats](http://astropy.readthedocs.org/en/stable/io/ascii/index.html#supported-formats) easily.\n", "The following example uses a quite " ], "cell_type": "markdown", "metadata": {} }, { "execution_count": 7, "cell_type": "code", "source": [ "# Read the data from the source\n", "table = ascii.read(\"ftp://cdsarc.u-strasbg.fr/pub/cats/VII/253/snrs.dat\",\n", " readme=\"ftp://cdsarc.u-strasbg.fr/pub/cats/VII/253/ReadMe\")" ], "outputs": [], "metadata": { "collapsed": false, "keep": true } }, { "execution_count": 8, "cell_type": "code", "source": [ "# See the stats of the table\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 9, "cell_type": "code", "source": [ "# If we want to see the first 10 entries\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 10, "cell_type": "code", "source": [ "# the units are also stored, we can extract them too\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 11, "cell_type": "code", "source": [ "# Adding values of different columns\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 12, "cell_type": "code", "source": [ "# adding values of different columns but being aware of column units\n" ], "outputs": [], "metadata": { "collapsed": false 
} }, { "execution_count": 13, "cell_type": "code", "source": [ "# Create a new column in the table\n" ], "outputs": [], "metadata": { "collapsed": true } }, { "execution_count": 14, "cell_type": "code", "source": [ "# Show the table's new column\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 15, "cell_type": "code", "source": [ "# Add a description to the new column\n" ], "outputs": [], "metadata": { "collapsed": true } }, { "execution_count": 16, "cell_type": "code", "source": [ "# Now it does show the values\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 17, "cell_type": "code", "source": [ "# Using numpy to calculate the sin of the RA\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 18, "cell_type": "code", "source": [ "# Let's change the units...\n" ], "outputs": [], "metadata": { "collapsed": true } }, { "execution_count": 19, "cell_type": "code", "source": [ "# Does the sin work now?\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "## Properties when reading\n", "\n", "Reading a table can be controlled with many options. Consider the following simple example:" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": 20, "cell_type": "code", "source": [ "weather_data = \"\"\"\n", "# Country = Finland\n", "# City = Helsinki\n", "# Longitud = 24.9375\n", "# Latitud = 60.170833\n", "# Week = 32\n", "# Year = 2015\n", "day, precip, type\n", "Mon,1.5,rain\n", "Tues,,\n", "Wed,1.1,snow\n", "Thur,2.3,rain\n", "Fri,0.2,\n", "Sat,1.1,snow\n", "Sun,5.4,snow\n", "\"\"\"" ], "outputs": [], "metadata": { "collapsed": true, "keep": true } }, { "execution_count": 21, "cell_type": "code", "source": [ "# Read the table\n" ], "outputs": [], "metadata": { "collapsed": true } }, { "execution_count": 22, "cell_type": "code", "source": [ "# Blank values are interpreted by default as bad/missing values\n" ], "outputs": [], "metadata": { "collapsed": false } 
}, { "execution_count": 23, "cell_type": "code", "source": [ "# Let's define missing values for the columns we want:\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 24, "cell_type": "code", "source": [ "# Use filled to show the filled values.\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 25, "cell_type": "code", "source": [ "# We can see the meta as a dictionary, but not as key-value pairs\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 26, "cell_type": "code", "source": [ "# To get the header as a table\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "When missing values are marked by something other than an empty field, the `fill_values` keyword of `read` has [to be used](http://astropy.readthedocs.org/en/stable/io/ascii/read.html#bad-or-missing-values).\n" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "## Reading VOTables\n", "\n", "VOTables are a special type of table that should be self-consistent and can be tied to a particular schema.\n", "This means the file records where the data comes from (and which query produced it) as well as the properties of each field, making it easier for a machine to ingest." 
], "cell_type": "markdown", "metadata": { "collapsed": false } }, { "execution_count": 27, "cell_type": "code", "source": [ "from astropy.io.votable import parse_single_table" ], "outputs": [], "metadata": { "collapsed": false, "keep": true } }, { "execution_count": 28, "cell_type": "code", "source": [ "# Read the example table from HELIO (hfc_ar.xml)\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 29, "cell_type": "code", "source": [ "# See the fields of the table\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 30, "cell_type": "code", "source": [ "# Extract one (NOAA_NUMBER) or all of the columns\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 31, "cell_type": "code", "source": [ "# Show the data\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 32, "cell_type": "code", "source": [ "# See the mask\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 33, "cell_type": "code", "source": [ "# See the whole array.\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 34, "cell_type": "code", "source": [ "# Convert the table to an astropy table\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 35, "cell_type": "code", "source": [ "# See the table\n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 36, "cell_type": "code", "source": [ "# Different results because quantities are not \n" ], "outputs": [], "metadata": { "collapsed": false } }, { "execution_count": 37, "cell_type": "code", "source": [ "# And it can also be converted to other units\n" ], "outputs": [], "metadata": { "collapsed": false } } ], "metadata": { "kernelspec": { "display_name": "Python 2", "name": "python2", "language": "python2" }, "language_info": { "mimetype": "text/x-python", "nbconvert_exporter": "python", "name": "python", "file_extension": ".py", "version": 
"2.7.11", "pygments_lexer": "ipython2", "codemirror_mode": { "version": 2, "name": "ipython" } } } }