{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# A demonstration of Star/Galaxy separation in the SMASH catalog\n", "In this notebook, we will query the SMASH DR1 catalog and apply constraints on the \"sharp\" and \"prob\" parameters to demonstrate ways to create samples of likely stars and galaxies.\n", "\n", "## Visualization\n", "We will use Datashader and Bokeh to make fast interactive plots.\n", "\n", "## Known issues\n", "The color-magnitude diagram is plotted upside-down from the usual sense due to a current limitation in the notebook's use of Datashader." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Initialization\n", "\n", "We need modules from the Bokeh library, Datashader, NumPy, and Pandas, as well as the Data Lab modules to connect to and query the database." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Start\n", "Got token anonymous.0.0.anon_access\n" ] } ], "source": [ "print \"Start\"\n", "from cStringIO import StringIO\n", "from dl import authClient\n", "from dl import queryClient\n", "\n", "import pandas as pd\n", "import datashader as ds\n", "import datashader.glyphs\n", "import datashader.transfer_functions as tf\n", "import bokeh.plotting as bp\n", "from datashader.bokeh_ext import InteractiveImage\n", "\n", "# Get the security token for the datalab demo user\n", "token = authClient.login('anonymous')\n", "print \"Got token\",token" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Querying the SMASH DR1 catalog\n", "\n", "We will query the SMASH catalog over a range of fields to sample a variety of field properties. We set a constraint on the depthflag parameter to insist on detection in the deep exposures. To make the query go faster, one can restrict the range of fieldid in the query. The default field range will return roughly 6 million objects." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Your query is: select ra,dec,gmag,rmag,sharp,chi,prob from smash_dr1.object where (depthflag > 1 and (gmag is not null) and (fieldid>55 and fieldid<70))\n", "Making query\n", "6392274 objects found.\n", "CPU times: user 6.57 s, sys: 1.33 s, total: 7.9 s\n", "Wall time: 46.5 s\n" ] } ], "source": [ "%%time\n", "depth = 1 # minimum depth \n", "raname = 'ra'\n", "decname = 'dec'\n", "mags = 'gmag,rmag'\n", "dbase='smash_dr1.object'\n", "\n", "# Create the query string.\n", "query = ('select '+raname+','+decname+','+mags+',sharp,chi,prob from '+dbase+ \\\n", " ' where (depthflag > %d and ' + \\\n", " ' (gmag is not null) and ' + \\\n", " ' (fieldid>55 and fieldid<70))') % \\\n", " (depth)\n", " \n", "print \"Your query is:\", query\n", "print \"Making query\"\n", "\n", "# Call the Query Manager Service \n", "response = queryClient.query(token, adql = query, fmt = 'csv')\n", "df = pd.read_csv(StringIO(response))\n", "\n", "print len(df), \"objects found.\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Making cuts on parameters to separate stars from galaxies\n", "\n", "We will first make a single cut on the sharp parameter, and classify objects with sharp>0.7 as galaxies. Pandas allows us to add a Class column to the dataframe and specify that it should be considered a category. We will also add a g_r column to the Pandas dataframe." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "|  | ra | dec | gmag | rmag | sharp | chi | prob | Class | g_r |\n",
"|---|---|---|---|---|---|---|---|---|---|\n",
"| 6392269 | 107.060693 | -54.163880 | 99.99 | 99.99 | -0.846 | 0.78 | 0.73 | Galaxy | 0.0 |\n",
"| 6392270 | 107.059618 | -54.166219 | 99.99 | 99.99 | 0.965 | 0.71 | 0.92 | Galaxy | 0.0 |\n",
"| 6392271 | 107.058524 | -54.166433 | 99.99 | 99.99 | 0.778 | 0.83 | 0.56 | Galaxy | 0.0 |\n",
"| 6392272 | 107.058557 | -54.166684 | 99.99 | 99.99 | 3.260 | 0.97 | 0.96 | Galaxy | 0.0 |\n",
"| 6392273 | 107.059356 | -54.160246 | 99.99 | 99.99 | 0.335 | 0.59 | 0.72 | Star | 0.0 |\n", "
\n", "|  | ra | dec | gmag | rmag | sharp | chi | prob | Class | g_r |\n",
"|---|---|---|---|---|---|---|---|---|---|\n",
"| 6392269 | 107.060693 | -54.163880 | 99.99 | 99.99 | -0.846 | 0.78 | 0.73 | Star | 0.0 |\n",
"| 6392270 | 107.059618 | -54.166219 | 99.99 | 99.99 | 0.965 | 0.71 | 0.92 | Star | 0.0 |\n",
"| 6392271 | 107.058524 | -54.166433 | 99.99 | 99.99 | 0.778 | 0.83 | 0.56 | Star | 0.0 |\n",
"| 6392272 | 107.058557 | -54.166684 | 99.99 | 99.99 | 3.260 | 0.97 | 0.96 | Star | 0.0 |\n",
"| 6392273 | 107.059356 | -54.160246 | 99.99 | 99.99 | 0.335 | 0.59 | 0.72 | Star | 0.0 |\n", "