{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Reading data into Astropy Tables\n",
"\n",
"## Objectives\n",
"\n",
" - Read ASCII files with a defined format\n",
" - Learn basic operations with `astropy.tables`\n",
" - Ingest header information\n",
" - VOTables"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reading data\n",
"\n",
"Our first task with Python was to read a `csv` file using `np.loadtxt()`.\n",
"That function has a few options to define the column delimiter, skip rows, handle commented lines, convert values while reading, etc.\n",
"\n",
"However, the result is a plain array, without any of the metadata the file may have included (column names, units, ...).\n",
"\n",
"Astropy offers an ASCII reader that improves many of these steps and provides templates to read ASCII formats that are common in astronomy.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true,
"keep": true
},
"outputs": [],
"source": [
"from astropy.io import ascii"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
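"&lt;Table length=2&gt;\n",
"<table>\n",
"<thead><tr><th>obsid</th><th>redshift</th><th>X</th><th>Y</th><th>object</th></tr></thead>\n",
"<thead><tr><th>int64</th><th>float64</th><th>int64</th><th>int64</th><th>str11</th></tr></thead>\n",
"<tr><td>3102</td><td>0.32</td><td>4167</td><td>4085</td><td>Q1250+568-A</td></tr>\n",
"<tr><td>877</td><td>0.22</td><td>4378</td><td>3892</td><td>Source 82</td></tr>\n",
"</table>"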
],
"text/plain": [
"\n",
"obsid redshift X Y object \n",
"int64 float64 int64 int64 str11 \n",
"----- -------- ----- ----- -----------\n",
" 3102 0.32 4167 4085 Q1250+568-A\n",
" 877 0.22 4378 3892 Source 82"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Read a sample file: sources.dat\n",
"data = ascii.read(\"sources.dat\")\n",
"data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`read` has automatically identified the header and the data type of each column. The result is a `Table` object, which brings some additional properties."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/dvd/.conda/envs/swc/lib/python2.7/site-packages/astropy/table/column.py:268: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison\n",
" return self.data.__eq__(other)\n"
]
},
{
"data": {
"text/plain": [
"\n",
" name dtype \n",
"-------- -------\n",
" obsid int64\n",
"redshift float64\n",
" X int64\n",
" Y int64\n",
" object str11"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Show the info of the data read\n",
"data.info"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"['obsid', 'redshift', 'X', 'Y', 'object']"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get the names of the columns\n",
"data.colnames"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"&lt;Column name='obsid' dtype='int64' length=2&gt;\n",
"<table>\n",
"<tr><td>3102</td></tr>\n",
"<tr><td>877</td></tr>\n",
"</table>"
],
"text/plain": [
"\n",
"3102\n",
" 877"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get just the values of a particular column\n",
"data['obsid']"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
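"&lt;Row index=0&gt;\n",
"<table>\n",
"<thead><tr><th>obsid</th><th>redshift</th></tr></thead>\n",
"<thead><tr><th>int64</th><th>float64</th></tr></thead>\n",
"<tr><td>3102</td><td>0.32</td></tr>\n",
"</table>"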
],
"text/plain": [
"\n",
"obsid redshift\n",
"int64 float64 \n",
"----- --------\n",
" 3102 0.32"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get the first row of the 'obsid' and 'redshift' columns\n",
"data['obsid', 'redshift'][0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
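"Astropy [can read a variety of formats](http://astropy.readthedocs.org/en/stable/io/ascii/index.html#supported-formats) easily.\n",
"The following example uses a quite common one in astronomy: a CDS catalogue, where a `ReadMe` file describes the columns of the data file."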
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false,
"keep": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Downloading ftp://cdsarc.u-strasbg.fr/pub/cats/VII/253/snrs.dat [Done]\n",
"Downloading ftp://cdsarc.u-strasbg.fr/pub/cats/VII/253/ReadMe [Done]\n"
]
}
],
"source": [
"# Read the data from the source\n",
"table = ascii.read(\"ftp://cdsarc.u-strasbg.fr/pub/cats/VII/253/snrs.dat\",\n",
" readme=\"ftp://cdsarc.u-strasbg.fr/pub/cats/VII/253/ReadMe\")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
" name mean std min max n_bad\n",
"---------- -------------- -------------- --- ------ -----\n",
" SNR -- -- -- -- 0\n",
" RAh 16.0547445255 4.15229196762 0 23 0\n",
" RAm 28.8576642336 16.9123382708 0 59 0\n",
" RAs 28.102189781 18.5923556505 0 59 0\n",
" DE- -- -- -- -- 0\n",
" DEd 33.602189781 19.4333634671 0 72 0\n",
" DEm 29.6459854015 17.7768672558 0 59 0\n",
" MajDiam 30.9124087591 42.1254815567 1.5 310.0 0\n",
" --- -- -- -- -- 164\n",
" MinDiam 23.4909090909 33.3816758266 2.0 240.0 164\n",
" u_MinDiam -- -- -- -- 243\n",
" type -- -- -- -- 0\n",
" l_S(1GHz) -- -- -- -- 270\n",
" S(1GHz) 42.6488549618 212.136906631 0.3 2720.0 12\n",
" u_S(1GHz) -- -- -- -- 157\n",
" Sp-Index 0.486981132075 0.158350398486 0.0 1.0 62\n",
"u_Sp-Index -- -- -- -- 154\n",
" Names -- -- -- -- 194\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/dvd/.conda/envs/swc/lib/python2.7/site-packages/astropy/table/info.py:94: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal\n",
" if np.all(info[name] == ''):\n"
]
}
],
"source": [
"# See the stats of the table\n",
"table.info('stats')"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
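"&lt;Table masked=True length=10&gt;\n",
"<table>\n",
"<thead><tr><th>SNR</th><th>RAh</th><th>RAm</th><th>RAs</th><th>DE-</th><th>DEd</th><th>DEm</th><th>MajDiam</th><th>---</th><th>MinDiam</th><th>u_MinDiam</th><th>type</th><th>l_S(1GHz)</th><th>S(1GHz)</th><th>u_S(1GHz)</th><th>Sp-Index</th><th>u_Sp-Index</th><th>Names</th></tr></thead>\n",
"<thead><tr><th></th><th>h</th><th>min</th><th>s</th><th></th><th>deg</th><th>arcmin</th><th>arcmin</th><th></th><th>arcmin</th><th></th><th></th><th></th><th>Jy</th><th></th><th></th><th></th><th></th></tr></thead>\n",
"<thead><tr><th>str11</th><th>int64</th><th>int64</th><th>int64</th><th>str1</th><th>int64</th><th>int64</th><th>float64</th><th>str1</th><th>float64</th><th>str1</th><th>str2</th><th>str1</th><th>float64</th><th>str1</th><th>float64</th><th>str1</th><th>str26</th></tr></thead>\n",
"<tr><td>G000.0+00.0</td><td>17</td><td>45</td><td>44</td><td>-</td><td>29</td><td>0</td><td>3.5</td><td>x</td><td>2.5</td><td>--</td><td>S</td><td>--</td><td>100.0</td><td>?</td><td>0.8</td><td>?</td><td>Sgr A East</td></tr>\n",
"<tr><td>G000.3+00.0</td><td>17</td><td>46</td><td>15</td><td>-</td><td>28</td><td>38</td><td>15.0</td><td>x</td><td>8.0</td><td>--</td><td>S</td><td>--</td><td>22.0</td><td>--</td><td>0.6</td><td>--</td><td>--</td></tr>\n",
"<tr><td>G000.9+00.1</td><td>17</td><td>47</td><td>21</td><td>-</td><td>28</td><td>9</td><td>8.0</td><td>--</td><td>--</td><td>--</td><td>C</td><td>--</td><td>18.0</td><td>?</td><td>--</td><td>v</td><td>--</td></tr>\n",
"<tr><td>G001.0-00.1</td><td>17</td><td>48</td><td>30</td><td>-</td><td>28</td><td>9</td><td>8.0</td><td>--</td><td>--</td><td>--</td><td>S</td><td>--</td><td>15.0</td><td>--</td><td>0.6</td><td>?</td><td>--</td></tr>\n",
"<tr><td>G001.4-00.1</td><td>17</td><td>49</td><td>39</td><td>-</td><td>27</td><td>46</td><td>10.0</td><td>--</td><td>--</td><td>--</td><td>S</td><td>--</td><td>2.0</td><td>?</td><td>--</td><td>?</td><td>--</td></tr>\n",
"<tr><td>G001.9+00.3</td><td>17</td><td>48</td><td>45</td><td>-</td><td>27</td><td>10</td><td>1.5</td><td>--</td><td>--</td><td>--</td><td>S</td><td>--</td><td>0.6</td><td>--</td><td>0.6</td><td>--</td><td>--</td></tr>\n",
"<tr><td>G003.7-00.2</td><td>17</td><td>55</td><td>26</td><td>-</td><td>25</td><td>50</td><td>14.0</td><td>x</td><td>11.0</td><td>--</td><td>S</td><td>--</td><td>2.3</td><td>--</td><td>0.65</td><td>--</td><td>--</td></tr>\n",
"<tr><td>G003.8+00.3</td><td>17</td><td>52</td><td>55</td><td>-</td><td>25</td><td>28</td><td>18.0</td><td>--</td><td>--</td><td>--</td><td>S?</td><td>--</td><td>3.0</td><td>?</td><td>0.6</td><td>--</td><td>--</td></tr>\n",
"<tr><td>G004.2-03.5</td><td>18</td><td>8</td><td>55</td><td>-</td><td>27</td><td>3</td><td>28.0</td><td>--</td><td>--</td><td>--</td><td>S</td><td>--</td><td>3.2</td><td>?</td><td>0.6</td><td>?</td><td>--</td></tr>\n",
"<tr><td>G004.5+06.8</td><td>17</td><td>30</td><td>42</td><td>-</td><td>21</td><td>29</td><td>3.0</td><td>--</td><td>--</td><td>--</td><td>S</td><td>--</td><td>19.0</td><td>--</td><td>0.64</td><td>--</td><td>Kepler, SN1604, 3C358</td></tr>\n",
"</table>"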
],
"text/plain": [
"\n",
" SNR RAh RAm RAs ... Sp-Index u_Sp-Index Names \n",
" h min s ... \n",
" str11 int64 int64 int64 ... float64 str1 str26 \n",
"----------- ----- ----- ----- ... -------- ---------- ---------------------\n",
"G000.0+00.0 17 45 44 ... 0.8 ? Sgr A East\n",
"G000.3+00.0 17 46 15 ... 0.6 -- --\n",
"G000.9+00.1 17 47 21 ... -- v --\n",
"G001.0-00.1 17 48 30 ... 0.6 ? --\n",
"G001.4-00.1 17 49 39 ... -- ? --\n",
"G001.9+00.3 17 48 45 ... 0.6 -- --\n",
"G003.7-00.2 17 55 26 ... 0.65 -- --\n",
"G003.8+00.3 17 52 55 ... 0.6 -- --\n",
"G004.2-03.5 18 8 55 ... 0.6 ? --\n",
"G004.5+06.8 17 30 42 ... 0.64 -- Kepler, SN1604, 3C358"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# If we want to see the first 10 entries\n",
"table[0:10]"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/latex": [
"$[0.0010181087,~0.0043633231,~0.0023271057] \\; \\mathrm{rad}$"
],
"text/plain": [
""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# The units are also stored, so values can be converted to other units\n",
"table['MajDiam'].quantity.to('rad')[0:3]"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"&lt;MaskedColumn name='RAh' dtype='int64' unit='h' description='*Right Ascension J2000 hours' length=3&gt;\n",
"<table>\n",
"<tr><td>106</td></tr>\n",
"<tr><td>78</td></tr>\n",
"<tr><td>85</td></tr>\n",
"</table>"
],
"text/plain": [
"\n",
"106\n",
" 78\n",
" 85"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Adding the plain values of different columns ignores their units\n",
"(table['RAh'] + table['RAm'] + table['RAs'])[0:3]\n"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/latex": [
"$[17.762222,~17.770833,~17.789167] \\; \\mathrm{h}$"
],
"text/plain": [
""
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Adding the values of different columns while taking their units into account\n",
"(table['RAh'].quantity + table['RAm'].quantity + table['RAs'].quantity)[0:3]"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Create a new column in the table\n",
"table['RA'] = table['RAh'].quantity + table['RAm'].quantity + table['RAs'].quantity"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"&lt;MaskedColumn name='RA' dtype='float64' unit='h' length=3&gt;\n",
"<table>\n",
"<tr><td>17.7622222222</td></tr>\n",
"<tr><td>17.7708333333</td></tr>\n",
"<tr><td>17.7891666667</td></tr>\n",
"</table>"
],
"text/plain": [
"\n",
"17.7622222222\n",
"17.7708333333\n",
"17.7891666667"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Show table's new column \n",
"table['RA'][0:3]"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# add a description to the new column\n",
"table['RA'].description = table['RAh'].description"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"&lt;MaskedColumn name='RA' dtype='float64' unit='h' description='*Right Ascension J2000 hours' length=3&gt;\n",
"<table>\n",
"<tr><td>17.7622222222</td></tr>\n",
"<tr><td>17.7708333333</td></tr>\n",
"<tr><td>17.7891666667</td></tr>\n",
"</table>"
],
"text/plain": [
"\n",
"17.7622222222\n",
"17.7708333333\n",
"17.7891666667"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Now the column also shows its description\n",
"table['RA'][0:3]\n"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [
{
"ename": "TypeError",
"evalue": "Can only apply 'sin' function to quantities with angle units",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mnumpy\u001b[0m \u001b[1;32mas\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 2\u001b[1;33m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msin\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtable\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m'RA'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mquantity\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[1;32m/home/dvd/.conda/envs/swc/lib/python2.7/site-packages/astropy/units/quantity.pyc\u001b[0m in \u001b[0;36m__array_prepare__\u001b[1;34m(self, obj, context)\u001b[0m\n\u001b[0;32m 321\u001b[0m \u001b[1;31m# the unit the output from the ufunc will have.\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 322\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mfunction\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mUFUNC_HELPERS\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 323\u001b[1;33m \u001b[0mconverters\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mresult_unit\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mUFUNC_HELPERS\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mfunction\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfunction\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m*\u001b[0m\u001b[0munits\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 324\u001b[0m \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 325\u001b[0m raise TypeError(\"Unknown ufunc {0}. Please raise issue on \"\n",
"\u001b[1;32m/home/dvd/.conda/envs/swc/lib/python2.7/site-packages/astropy/units/quantity_helper.pyc\u001b[0m in \u001b[0;36mhelper_radian_to_dimensionless\u001b[1;34m(f, unit)\u001b[0m\n\u001b[0;32m 174\u001b[0m raise TypeError(\"Can only apply '{0}' function to \"\n\u001b[0;32m 175\u001b[0m \u001b[1;34m\"quantities with angle units\"\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 176\u001b[1;33m .format(f.__name__))\n\u001b[0m\u001b[0;32m 177\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 178\u001b[0m \u001b[0mUFUNC_HELPERS\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcos\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mhelper_radian_to_dimensionless\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mTypeError\u001b[0m: Can only apply 'sin' function to quantities with angle units"
]
}
],
"source": [
"# Use numpy to calculate the sine of the RA (fails: 'h' is a time unit, not an angle)\n",
"import numpy as np\n",
"np.sin(table['RA'].quantity)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Let's change the unit from hours (time) to hour angle\n",
"import astropy.units as u\n",
"table['RA'].unit = u.hourangle"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/latex": [
"$[-0.99806309,~-0.9982008,~-0.99847709,~\\dots, -0.99835018,~-0.99799924,~-0.99604107] \\; \\mathrm{}$"
],
"text/plain": [
""
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Does the sine work now?\n",
"np.sin(table['RA'].quantity)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Properties when reading\n",
"\n",
"Reading a table offers many options. Consider the following simple example:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": true,
"keep": true
},
"outputs": [],
"source": [
"weather_data = \"\"\"\n",
"# Country = Finland\n",
"# City = Helsinki\n",
"# Longitud = 24.9375\n",
"# Latitud = 60.170833\n",
"# Week = 32\n",
"# Year = 2015\n",
"day, precip, type\n",
"Mon,1.5,rain\n",
"Tues,,\n",
"Wed,1.1,snow\n",
"Thur,2.3,rain\n",
"Fri,0.2,\n",
"Sat,1.1,snow\n",
"Sun,5.4,snow\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Read the table\n",
"weather = ascii.read(weather_data)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
" name mean std min max n_bad\n",
"------ ------------- ------------- --- --- -----\n",
" day -- -- -- -- 0\n",
"precip 1.93333333333 1.66999667332 0.2 5.4 1\n",
" type -- -- -- -- 2\n"
]
}
],
"source": [
"# Blank values are interpreted by default as bad/missing values\n",
"weather.info('stats')"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Let's define the fill values for the columns with missing data:\n",
"weather['type'].fill_value = 'N/A'\n",
"weather['precip'].fill_value = -999"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
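"&lt;Table length=7&gt;\n",
"<table>\n",
"<thead><tr><th>day</th><th>precip</th><th>type</th></tr></thead>\n",
"<thead><tr><th>str4</th><th>float64</th><th>str4</th></tr></thead>\n",
"<tr><td>Mon</td><td>1.5</td><td>rain</td></tr>\n",
"<tr><td>Tues</td><td>-999.0</td><td>N/A</td></tr>\n",
"<tr><td>Wed</td><td>1.1</td><td>snow</td></tr>\n",
"<tr><td>Thur</td><td>2.3</td><td>rain</td></tr>\n",
"<tr><td>Fri</td><td>0.2</td><td>N/A</td></tr>\n",
"<tr><td>Sat</td><td>1.1</td><td>snow</td></tr>\n",
"<tr><td>Sun</td><td>5.4</td><td>snow</td></tr>\n",
"</table>"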
],
"text/plain": [
"\n",
"day precip type\n",
"str4 float64 str4\n",
"---- ------- ----\n",
" Mon 1.5 rain\n",
"Tues -999.0 N/A\n",
" Wed 1.1 snow\n",
"Thur 2.3 rain\n",
" Fri 0.2 N/A\n",
" Sat 1.1 snow\n",
" Sun 5.4 snow"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Use filled() to show the table with the missing values replaced by the fill values\n",
"weather.filled()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"OrderedDict([('comments',\n",
" ['Country = Finland',\n",
" 'City = Helsinki',\n",
" 'Longitud = 24.9375',\n",
" 'Latitud = 60.170833',\n",
" 'Week = 32',\n",
" 'Year = 2015'])])"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# The metadata is a dictionary, but the header comments are stored as a list of strings, not as key-value pairs\n",
"weather.meta"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" key val \n",
"-------- ---------\n",
" Country Finland\n",
" City Helsinki\n",
"Longitud 24.9375\n",
" Latitud 60.170833\n",
" Week 32\n",
" Year 2015\n"
]
}
],
"source": [
"# To get the header as a table\n",
"header = ascii.read(weather.meta['comments'], delimiter='=',\n",
" format='no_header', names=['key', 'val'])\n",
"print(header)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When missing values are marked with something other than an empty field, the `fill_values` keyword of `read` [has to be used](http://astropy.readthedocs.org/en/stable/io/ascii/read.html#bad-or-missing-values).\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"## Reading VOTables\n",
"\n",
"VOTables are a special type of table that should be self-consistent and can be tied to a particular schema.\n",
"This means the file records where the data comes from (and which query produced it) and the properties of each field, making it easier for a machine to ingest."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"collapsed": false,
"keep": true
},
"outputs": [],
"source": [
"from astropy.io.votable import parse_single_table"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING: W03: hfc_ar.xml:10:0: W03: Implicitly generating an ID from a name 'OBSERVATORY,VIEW_AR_GUI' -> 'OBSERVATORY_VIEW_AR_GUI' [astropy.io.votable.exceptions]\n",
"WARNING:astropy:W03: hfc_ar.xml:10:0: W03: Implicitly generating an ID from a name 'OBSERVATORY,VIEW_AR_GUI' -> 'OBSERVATORY_VIEW_AR_GUI'\n"
]
}
],
"source": [
"# Read the example table from HELIO (hfc_ar.xml)\n",
"table = parse_single_table(\"hfc_ar.xml\")"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[,\n",
" ,\n",
" ,\n",
" ,\n",
" ,\n",
" ,\n",
" ,\n",
" ,\n",
" ,\n",
" ]"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# See the fields of the table\n",
"table.fields"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Extract a single column (NOAA_NUMBER) from the table's array\n",
"NOAA = table.array['NOAA_NUMBER']"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 10321, -2147483648, 10325, ..., -2147483648,\n",
" 10332, -2147483648], dtype=int32)"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Show the data\n",
"NOAA.data"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([False, True, False, ..., True, False, True], dtype=bool)"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# See the mask\n",
"NOAA.mask"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"masked_array(data = [10321 -- 10325 ..., -- 10332 --],\n",
" mask = [False True False ..., True False True],\n",
" fill_value = 999999)"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# See the whole array.\n",
"NOAA"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Convert the table to an astropy table\n",
"asttable = table.to_table()"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
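"&lt;Table masked=True length=3037&gt;\n",
"<table>\n",
"<thead><tr><th>ID_AR</th><th>DATE_OBS</th><th>NOAA_NUMBER</th><th>FEAT_HG_LAT_DEG</th><th>FEAT_AREA_DEG2</th><th>FEAT_X_ARCSEC</th><th>FEAT_Y_ARCSEC</th><th>FEAT_MAX_INT</th><th>FEAT_MIN_INT</th><th>FEAT_MEAN_INT</th></tr></thead>\n",
"<thead><tr><th></th><th></th><th></th><th>deg</th><th>deg2</th><th>arcs</th><th>arcs</th><th>gauss</th><th>gauss</th><th>gauss</th></tr></thead>\n",
"<thead><tr><th>int32</th><th>object</th><th>int32</th><th>float32</th><th>float32</th><th>float64</th><th>float64</th><th>float32</th><th>float32</th><th>float32</th></tr></thead>\n",
"<tr><td>247523</td><td>2003-04-01T00:00:00</td><td>10321</td><td>4.87889</td><td>190.33</td><td>273.78399999999999</td><td>185.994</td><td>1789.54</td><td>-2573.3201</td><td>-0.46051401</td></tr>\n",
"<tr><td>247528</td><td>2003-04-01T00:00:00</td><td>--</td><td>-12.1797</td><td>160.16</td><td>-188.178</td><td>-96.1541</td><td>2147.6201</td><td>-1722.37</td><td>-46.435699</td></tr>\n",
"<tr><td>247539</td><td>2003-04-01T00:00:00</td><td>10325</td><td>12.3932</td><td>147.224</td><td>-458.50900000000001</td><td>298.358</td><td>3202.8701</td><td>-1377.66</td><td>-2.20612</td></tr>\n",
"<tr><td>247545</td><td>2003-04-01T00:00:00</td><td>--</td><td>-15.9363</td><td>129.17799</td><td>-577.27200000000005</td><td>-179.56999999999999</td><td>766.06</td><td>-2342.4299</td><td>32.803101</td></tr>\n",
"<tr><td>247549</td><td>2003-04-01T00:00:00</td><td>10323</td><td>-7.57478</td><td>108.752</td><td>472.27600000000001</td><td>-31.2653</td><td>2495.7</td><td>-2757.1101</td><td>-4.3433499</td></tr>\n",
"<tr><td>247563</td><td>2003-04-01T00:00:00</td><td>10318</td><td>-14.9789</td><td>77.475998</td><td>705.50699999999995</td><td>-177.726</td><td>1379.92</td><td>-1706.51</td><td>-18.2155</td></tr>\n",
"<tr><td>247581</td><td>2003-04-01T00:00:00</td><td>10319</td><td>12.0031</td><td>62.118</td><td>799.56500000000005</td><td>254.83600000000001</td><td>2780.72</td><td>-1257.1</td><td>-11.4169</td></tr>\n",
"<tr><td>247591</td><td>2003-04-01T00:00:00</td><td>--</td><td>9.3859196</td><td>46.102001</td><td>-251.09299999999999</td><td>260.08199999999999</td><td>902.16302</td><td>0.0</td><td>51.014099</td></tr>\n",
"<tr><td>247595</td><td>2003-04-01T00:00:00</td><td>--</td><td>-25.298</td><td>35.77</td><td>222.72</td><td>-311.79500000000002</td><td>80.717697</td><td>-727.44299</td><td>-37.111</td></tr>\n",
"<tr><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><td>268597</td><td>2003-04-16T00:00:00</td><td>10334</td><td>-8.70121</td><td>98.573997</td><td>-17.854299999999999</td><td>-52.354100000000003</td><td>1070.77</td><td>-2614.1399</td><td>-8.5475798</td></tr>\n",
"<tr><td>268599</td><td>2003-04-16T00:00:00</td><td>--</td><td>18.068001</td><td>59.891998</td><td>-175.74000000000001</td><td>381.72199999999998</td><td>840.00299</td><td>-831.59003</td><td>25.4533</td></tr>\n",
"<tr><td>268602</td><td>2003-04-16T00:00:00</td><td>10335</td><td>-21.757099</td><td>53.382</td><td>-469.75299999999999</td><td>-279.74200000000002</td><td>1158.71</td><td>-1231.86</td><td>-15.6804</td></tr>\n",
"<tr><td>268607</td><td>2003-04-16T00:00:00</td><td>--</td><td>-8.2958803</td><td>43.610001</td><td>-238.803</td><td>-48.5501</td><td>773.19897</td><td>-896.87598</td><td>-4.4306302</td></tr>\n",
"<tr><td>268610</td><td>2003-04-16T00:00:00</td><td>--</td><td>-25.8505</td><td>29.035999</td><td>-698.45000000000005</td><td>-366.28100000000001</td><td>884.698</td><td>-856.461</td><td>-6.8229899</td></tr>\n",
"<tr><td>268611</td><td>2003-04-16T00:00:00</td><td>--</td><td>9.7694597</td><td>12.628</td><td>750.81700000000001</td><td>216.76499999999999</td><td>766.763</td><td>-150.491</td><td>41.4245</td></tr>\n",
"<tr><td>268615</td><td>2003-04-16T00:00:00</td><td>--</td><td>-5.10109</td><td>12.18</td><td>-492.81999999999999</td><td>-5.5906700000000003</td><td>0.0</td><td>-517.284</td><td>-44.7612</td></tr>\n",
"<tr><td>268624</td><td>2003-04-16T00:00:00</td><td>--</td><td>16.0196</td><td>11.508</td><td>-420.512</td><td>341.94900000000001</td><td>84.997002</td><td>-611.87402</td><td>-33.491901</td></tr>\n",
"<tr><td>268637</td><td>2003-04-16T00:00:00</td><td>10332</td><td>11.5611</td><td>9.0860004</td><td>704.524</td><td>250.667</td><td>532.935</td><td>-733.26202</td><td>-6.8816299</td></tr>\n",
"<tr><td>268642</td><td>2003-04-16T00:00:00</td><td>--</td><td>11.9425</td><td>8.5539999</td><td>-381.387</td><td>279.82100000000003</td><td>0.0</td><td>-640.14502</td><td>-37.867901</td></tr>\n",
"</table>"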
],
"text/plain": [
"\n",
"ID_AR DATE_OBS NOAA_NUMBER ... FEAT_MIN_INT FEAT_MEAN_INT\n",
" ... gauss gauss \n",
"int32 object int32 ... float32 float32 \n",
"------ ------------------- ----------- ... ------------ -------------\n",
"247523 2003-04-01T00:00:00 10321 ... -2573.3201 -0.46051401\n",
"247528 2003-04-01T00:00:00 -- ... -1722.37 -46.435699\n",
"247539 2003-04-01T00:00:00 10325 ... -1377.66 -2.20612\n",
"247545 2003-04-01T00:00:00 -- ... -2342.4299 32.803101\n",
"247549 2003-04-01T00:00:00 10323 ... -2757.1101 -4.3433499\n",
"247563 2003-04-01T00:00:00 10318 ... -1706.51 -18.2155\n",
"247581 2003-04-01T00:00:00 10319 ... -1257.1 -11.4169\n",
"247591 2003-04-01T00:00:00 -- ... 0.0 51.014099\n",
"247595 2003-04-01T00:00:00 -- ... -727.44299 -37.111\n",
" ... ... ... ... ... ...\n",
"268597 2003-04-16T00:00:00 10334 ... -2614.1399 -8.5475798\n",
"268599 2003-04-16T00:00:00 -- ... -831.59003 25.4533\n",
"268602 2003-04-16T00:00:00 10335 ... -1231.86 -15.6804\n",
"268607 2003-04-16T00:00:00 -- ... -896.87598 -4.4306302\n",
"268610 2003-04-16T00:00:00 -- ... -856.461 -6.8229899\n",
"268611 2003-04-16T00:00:00 -- ... -150.491 41.4245\n",
"268615 2003-04-16T00:00:00 -- ... -517.284 -44.7612\n",
"268624 2003-04-16T00:00:00 -- ... -611.87402 -33.491901\n",
"268637 2003-04-16T00:00:00 10332 ... -733.26202 -6.8816299\n",
"268642 2003-04-16T00:00:00 -- ... -640.14502 -37.867901"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# See the table\n",
"asttable"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"FEAT_HG_LAT_DEG\n",
"---------------\n",
" -0.986171\n",
" 0.377107\n",
" -0.172306\n",
" 0.226358\n",
" -0.961276\n",
"[ 0.08504982 -0.21097848 0.21461941 -0.27456847 -0.13182007]\n"
]
}
],
"source": [
"# Different results: a plain column is treated as radians, while a quantity is converted from degrees first\n",
"print(np.sin(asttable['FEAT_HG_LAT_DEG'][0:5]))\n",
"print(np.sin(asttable['FEAT_HG_LAT_DEG'][0:5].quantity))\n"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[ 685188. 576576. 530006.375 465040.78125 391507.1875 ] arcmin2\n"
]
}
],
"source": [
"# And it can also be converted to other units\n",
"print(asttable[0:5]['FEAT_AREA_DEG2'].quantity.to('arcmin2'))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python2",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.11"
}
},
"nbformat": 4,
"nbformat_minor": 0
}