{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Interactive plotting of Data Lab catalog data\n", "In this notebook, we will retrieve the Hydra II SMASH catalog data and make an interactive pair of plots using the Bokeh (http://bokeh.pydata.org/en/latest/) library.\n", "\n", "## Data retrieval\n", "We will use the code example provided in the \"How to use the DataLab query manager service\" notebook to access the data. The columns we will need are RA, Dec, g magnitude, r magnitude, and depthflag.\n", "\n", "## Visualization\n", "Bokeh comes with a number of built-in tools for producing interactive plots. In this example, we will make a pair of plots, a plot of RA vs. Dec on the left and g-r vs. r on the right. We will use Bokeh's linked brushing tools to interactively select a set of points on the RA vs. Dec plot, which will automatically highlight the same points on the color-magnitude diagram on the right. The intended use of this notebook is that the user will start with candidate overdensities and then use Bokeh tools to explore them interactively.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Initialization\n", "\n", "We need modules from the Bokeh library, NumPy, and Pandas. For the Data Lab query, we need authClient and queryClient from Data Lab's dl library." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Start\n", "Got token anonymous.0.0.anon_access\n" ] } ], "source": [ "print \"Start\"\n", "from bokeh.models import ColumnDataSource\n", "from bokeh.models import LinearAxis,Range1d\n", "from bokeh.plotting import figure, gridplot, output_file, show\n", "from bokeh.io import output_notebook\n", "import numpy as np\n", "import sys\n", "import pandas as pd\n", "\n", "from cStringIO import StringIO\n", "from dl import authClient\n", "from dl import queryClient \n", "\n", "# Get the security token for the datalab demo user\n", "token = authClient.login('anonymous')\n", "print \"Got token\",token" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Query the SMASH DR1 database\n", "\n", "We will query the averaged photometry table from the SMASH catalog and select Field 169, which we know contains the Hydra II dwarf." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Your query is: select ra,dec,gmag,rmag,depthflag from smash_dr1.object where (fieldid = '169' AND (depthflag > 1) and (abs(sharp) < 0.5) and (gmag is not null) and (gmag between 9 and 25) and ((gmag-rmag) between -1.5 and 3.0))\n" ] } ], "source": [ "field = 169 # SMASH Field Number to query\n", "depth = 1 # minimum depth \n", "raname = 'ra'\n", "decname = 'dec'\n", "mags = 'gmag,rmag'\n", "dbase='smash_dr1.object'\n", "fid = 'fieldid'\n", "\n", "# Create the query string.\n", "query = ('select '+raname+','+decname+','+mags+',depthflag from '+dbase+ \\\n", " ' where ('+fid+' = \\'%d\\' AND' \\\n", " ' (depthflag > %d) and ' + \\\n", " ' (abs(sharp) < 0.5) and ' + \\\n", " ' (gmag is not null) and ' + \\\n", " ' (gmag between 9 and 25) and ' + \\\n", " ' ((gmag-rmag) between -1.5 and 3.0))') % \\\n", " (field, depth)\n", " \n", "print 
\"Your query is:\", query" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We issue the query through the Query Manager, which connects directly to the database." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Making query\n", "297788 objects found.\n", "CPU times: user 299 ms, sys: 68 ms, total: 367 ms\n", "Wall time: 12.6 s\n" ] } ], "source": [ "%%time\n", "print \"Making query\"\n", "# Call the Query Manager Service \n", "response = queryClient.query(token, adql = query, fmt = 'csv')\n", "df = pd.read_csv(StringIO(response))\n", "\n", "print len(df), \"objects found.\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data munging\n", "\n", "Next we add a g-r color column to the Pandas dataframe." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n", "<thead><tr><th></th><th>ra</th><th>dec</th><th>gmag</th><th>rmag</th><th>depthflag</th><th>g_r</th></tr></thead>\n", "<tbody>\n",
"<tr><th>297783</th><td>185.509066</td><td>-30.488367</td><td>24.6092</td><td>24.4970</td><td>2</td><td>0.1122</td></tr>\n",
"<tr><th>297784</th><td>185.515371</td><td>-30.490068</td><td>24.8446</td><td>25.0287</td><td>2</td><td>-0.1841</td></tr>\n",
"<tr><th>297785</th><td>185.479043</td><td>-30.480986</td><td>24.9664</td><td>25.2752</td><td>2</td><td>-0.3088</td></tr>\n",
"<tr><th>297786</th><td>185.508911</td><td>-30.476832</td><td>23.9298</td><td>23.8104</td><td>2</td><td>0.1194</td></tr>\n",
"<tr><th>297787</th><td>185.526843</td><td>-30.497895</td><td>24.5801</td><td>24.4355</td><td>2</td><td>0.1446</td></tr>\n",
"</tbody>\n", "</table>
" ], "text/plain": [ " ra dec gmag rmag depthflag g_r\n", "297783 185.509066 -30.488367 24.6092 24.4970 2 0.1122\n", "297784 185.515371 -30.490068 24.8446 25.0287 2 -0.1841\n", "297785 185.479043 -30.480986 24.9664 25.2752 2 -0.3088\n", "297786 185.508911 -30.476832 23.9298 23.8104 2 0.1194\n", "297787 185.526843 -30.497895 24.5801 24.4355 2 0.1446" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[\"g_r\"]=df[\"gmag\"]-df[\"rmag\"]\n", "df.tail()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setting up the visualization with Bokeh\n", "\n", "This function from the Bokeh library triggers embedded plotting output in the notebook. Alternatively, we could have used output_file() to save output to html for separate viewing." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " \n", " Loading BokehJS ...\n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [
"output_notebook()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The ColumnDataSource function packages the data to use in the Bokeh plots. The dictionary labels x1, x2, y1, and y2 will be referred to when we set up the figure objects." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [], "source": [ "source = ColumnDataSource(data=dict(x1=np.array(df[\"ra\"]), x2=np.array(df[\"g_r\"]), \\\n", " y1=np.array(df[\"dec\"]), y2=np.array(df[\"rmag\"])))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we will set up the plots. First, we select the Bokeh tools that we want to use. Refer to http://bokeh.pydata.org/en/latest/docs/user_guide/tools.html to see the full list of available tools." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [], "source": [ "TOOLS = \"box_select,lasso_select,pan,wheel_zoom,box_zoom,reset,help\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our plot on the left will be RA vs. Dec. We first create an instance of the figure object, specifying the toolset, the size of the plot, the title, whether to use WebGL acceleration, and the \"Level of Detail\" (lod) decimation factor, which determines how the plot behaves when doing interactive panning and zooming. Higher lod_factor means less detail is shown momentarily as the plot is updated interactively. Setting WebGL=True can speed up the interaction significantly, but isn't well handled by all browsers. Safari, for instance, will show only a blank plot with WebGL set to True." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "left = figure(tools=TOOLS, width=400, height=400, title=None,webgl=False,lod_factor=100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we call the scatter method of the figure object that we created, specifying the ColumnDataSource column names that hold the x and y values, the ColumnDataSource object that contains the data, the radius of the circles used as points, the color of the circles, and the transparency (fill_alpha) of the circles. Setting line_color=None draws the circles without outlines. We then reverse the RA axis so that RA increases to the left, and label both axes. We scale the symbol radius by 1/cos(Dec), using the median declination of the field, to avoid having the symbols change for fields at different declinations." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [], "source": [ "left.scatter('x1', 'y1', source=source, radius=0.005/np.cos(np.median(df[\"dec\"])/180*np.pi), fill_color='red', fill_alpha=0.1,line_color=None)\n", "left.x_range=Range1d(186.8,183.7)\n", "left.xaxis.axis_label = 'RA'\n", "left.yaxis.axis_label = 'Dec'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our plot on the right will be g-r vs. r. We set the range of the g-r axis to be -2 < g-r < 3, and we reverse the r axis (from 25 to 14) so that brighter objects appear at the top." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [], "source": [ "right = figure(tools=TOOLS, width=400, height=400, title=None,webgl=False,lod_factor=100)\n", "right.scatter('x2', 'y2', source=source,radius=0.02, fill_color='red', fill_alpha=0.5,line_color=None)\n", "right.x_range=Range1d(-2,3)\n", "right.y_range=Range1d(25,14)\n", "right.xaxis.axis_label = 'g-r'\n", "right.yaxis.axis_label = 'r'\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we combine the two figures with gridplot, a layout function that arranges them in a grid, so that the two plots are shown side by side."
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true }, "outputs": [], "source": [ "p = gridplot([[left, right]])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The plots\n", "\n", "Finally, we render the plots. The figures are interactive: you can pan, zoom, and select points in one plot, and the same points are then highlighted in the other plot. With the large number of points used here (nearly 300,000), the interaction can be a little slow, depending on browser and hardware. Try Box Select on the clump of points at lower left, where Hydra II is lurking." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "
\n", "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show(p)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.13" } }, "nbformat": 4, "nbformat_minor": 0 }