{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Plotting the Sun's Polar Field with Python + Altair" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook, we will plot the average magnetic field around the solar poles to determine when, exactly, the Sun's polar field flips. We will use data from the Helioseismic and Magnetic Imager (HMI) instrument on NASA's Solar Dynamics Observatory (SDO) satellite, which takes about 1.5 terabytes of data a day (more than any other satellite in the NASA Heliophysics Division). And we'll plot these data using [Altair](https://altair-viz.github.io/index.html), a Python library for interactive data visualization, which is built on top of [Vega](https://vega.github.io/vega/about/), which is in turn built on top of [D3](https://d3js.org/)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import drms\n", "import altair as alt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we'll use [drms](https://docs.sunpy.org/projects/drms/en/latest/intro.html), a Python library that fetches SDO data from the [JSOC database](http://jsoc.stanford.edu/ajax/lookdata.html) via a JSON API. The drms series `hmi.meanpf_720s` contains two keywords, `CAPN2` and `CAPS2`, which describe the mean radial component of the magnetic field in the northern and southern polar regions (above 60 degrees latitude). 
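\n", "\n", "As a quick sanity check, you can list every keyword the series provides with the client's `keys` method (a sketch; it queries the JSOC servers, so it needs network access):\n", "\n", "```python\n", "import drms\n", "\n", "c = drms.Client()\n", "keywords = c.keys('hmi.meanpf_720s')\n", "print(keywords)  # should include CAPN2 and CAPS2\n", "```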
\n", "\n", "To start using drms, first establish a connection to JSOC:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "c = drms.Client()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then query for the most recent timestamp, or `T_REC`:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "keys = c.query('hmi.meanpf_720s[$]', key='T_REC')" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "t_last = keys.T_REC[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now gather all the keyword values for `CAPN2` and `CAPS2` at a 12-hour cadence between SDO's launch and the most recent `T_REC` (the `[? QUALITY=0 ?]` clause keeps only good-quality records):" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "keys = c.query('hmi.meanpf_720s[2010.05.01_00_TAI-'+t_last+'@12h][? QUALITY=0 ?]', key='T_REC,CAPN2,CAPS2')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we'll convert `keys.CAPN2` and `keys.CAPS2` from their native format, a pandas Series, to a NumPy array:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "capn2 = np.array(keys.CAPN2)\n", "caps2 = np.array(keys.CAPS2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll estimate the error in `CAPN2` and `CAPS2` at any point in time as the standard deviation of that quantity over a 30-day window (beginning 15 days before the time of interest and ending 15 days after it):" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "err_capn2 = np.zeros(keys.index.stop)\n", "err_caps2 = np.zeros(keys.index.stop)\n", "\n", "for i in range(0,keys.index.stop):\n", " err_capn2[i] = np.std(capn2[i-30:i+30])\n", " \n", "for i in 
range(0,keys.index.stop):\n", " err_caps2[i] = np.std(caps2[i-30:i+30])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we'll smooth the curves by taking the running average of `CAPN2` and `CAPS2` over the same 30-day window:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "avg_capn2 = np.zeros(keys.index.stop)\n", "avg_caps2 = np.zeros(keys.index.stop)\n", "\n", "for i in range(0,keys.index.stop):\n", " avg_capn2[i] = np.mean(capn2[i-30:i+30])\n", " \n", "for i in range(0,keys.index.stop):\n", " avg_caps2[i] = np.mean(caps2[i-30:i+30])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And convert the `T_REC` strings into [pandas Timestamps](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timestamp.html):" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "def parse_tai_string(tstr):\n", " year = int(tstr[:4])\n", " month = int(tstr[5:7])\n", " day = int(tstr[8:10])\n", " hour = int(tstr[11:13])\n", " minute = int(tstr[14:16])\n", " return pd.Timestamp(year=year, month=month, day=day, hour=hour, minute=minute)\n", "\n", "x = np.array([parse_tai_string(keys.T_REC[i]) for i in range(keys.index.stop)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Altair expects data in a pandas DataFrame, so we'll construct one to hold everything:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "data = {'time': list(x), 'Average CAPN2': list(avg_capn2), 'CAPN2 plus error': list(avg_capn2+err_capn2), \n", " 'CAPN2 minus error': list(avg_capn2-err_capn2), 'Average CAPS2': list(avg_caps2), \n", " 'CAPS2 plus error': list(avg_caps2+err_caps2), 'CAPS2 minus error': list(avg_caps2-err_caps2)}\n", "df = pd.DataFrame(data, index=np.arange(keys.index.stop))" ] }, { 
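"cell_type": "markdown", "metadata": {}, "source": [ "As an aside: these windowed means and standard deviations can also be computed with pandas rolling windows, which sidestep the empty slices that the negative indices in the loops above produce near the start of the array (that's where the missing values come from). A sketch with synthetic data standing in for `capn2`:\n", "\n", "```python\n", "import numpy as np\n", "import pandas as pd\n", "\n", "rng = np.random.default_rng(0)\n", "capn2 = rng.normal(loc=3.0, scale=0.5, size=100)  # synthetic stand-in\n", "\n", "s = pd.Series(capn2)\n", "# Centered 61-sample window (about 30 days at the 12-hour cadence)\n", "avg_capn2 = s.rolling(61, center=True, min_periods=1).mean().to_numpy()\n", "err_capn2 = s.rolling(61, center=True, min_periods=1).std().to_numpy()\n", "```" ] }, { 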
"cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "Int64Index: 7095 entries, 0 to 7094\n", "Data columns (total 7 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 time 7095 non-null datetime64[ns]\n", " 1 Average CAPN2 7065 non-null float64 \n", " 2 CAPN2 plus error 7065 non-null float64 \n", " 3 CAPN2 minus error 7065 non-null float64 \n", " 4 Average CAPS2 7065 non-null float64 \n", " 5 CAPS2 plus error 7065 non-null float64 \n", " 6 CAPS2 minus error 7065 non-null float64 \n", "dtypes: datetime64[ns](1), float64(6)\n", "memory usage: 443.4 KB\n" ] } ], "source": [ "df.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, Altair refuses to plot more than 5,000 rows, since the interactive elements can get slow with large datasets. We have more rows than that, so we'll disable the limit:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "DataTransformerRegistry.enable('default')" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "alt.data_transformers.disable_max_rows()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now for some interactive plotting! Hover over any of the shaded regions to activate a tooltip that displays detailed information about that data point. You can also zoom in and out of the plot for more detail, or drag the plot around the axes. (Note: if you're viewing this notebook in JupyterLab, the interactive elements will show up by default. If you're using the classic Jupyter Notebook, you must install [ipyvega](https://github.com/vega/ipyvega) to see them.)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.LayerChart(...)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create the base plot\n", "base = alt.Chart(df).encode(\n", " alt.X('time', axis=alt.Axis(title='Time', grid=False))\n", ").properties(\n", " title='Mean Polar Field',\n", " width=800,\n", " height=300\n", ")\n", "\n", "# Plot CAPN2\n", "line = base.mark_line(color='#6394ED').encode(\n", " y='Average CAPN2'\n", ")\n", "\n", "# Plot CAPS2\n", "line2 = base.mark_line(color='#FF4500').encode(\n", " y='Average CAPS2'\n", ")\n", "\n", "# Shade the error band for CAPN2\n", "area = base.mark_area(opacity=0.3, color='#6394ED').encode(\n", " alt.Y('CAPN2 plus error', axis=alt.Axis(title='Mean Radial Field Strength (Gauss)', grid=False)),\n", " alt.Y2('CAPN2 minus error'),\n", " tooltip=['Average CAPN2', 'time']\n", ").interactive()\n", "\n", "\n", "# Shade the error band for CAPS2\n", "area2 = base.mark_area(opacity=0.3, color='#FF4500').encode(\n", " alt.Y('CAPS2 plus error', axis=alt.Axis(title='Mean Radial Field Strength (Gauss)')),\n", " alt.Y2('CAPS2 minus error'),\n", " tooltip=['Average CAPS2','time']\n", ").interactive()\n", "\n", "\n", "# Build a dummy dataframe to drive the legend\n", "source2 = pd.DataFrame([{\"Location\": \"North Pole\", \"Colors\": \"#6394ED\"}, {\"Location\": \"South Pole\", \"Colors\": \"#FF4500\"}])\n", "\n", "range_ = ['#6394ED', '#FF4500']\n", "\n", "legend = alt.Chart(source2).mark_line().encode(\n", " color=alt.Color('Location:N', scale=alt.Scale(range=range_, clamp=True), \n", " legend=alt.Legend(title=\"\", symbolSize=400, symbolStrokeWidth=4))\n", ")\n", "\n", "# Layer everything together\n", "alt.layer(line, line2, area, area2, legend).resolve_scale(\n", " y='shared'\n", ").configure_view(\n", " strokeWidth=0\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To see a daily-updated, interactive version of this plot, check out the HMI Polar Field site. 
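\n", "\n", "You can also write a standalone HTML page directly with Altair's `Chart.save` method, which embeds the chart's Vega-Lite specification in the file. A minimal, self-contained sketch (the tiny dataframe and file name are just examples):\n", "\n", "```python\n", "import pandas as pd\n", "import altair as alt\n", "\n", "df_demo = pd.DataFrame({'x': [0, 1, 2], 'y': [1.0, 3.0, 2.0]})\n", "demo = alt.Chart(df_demo).mark_line().encode(x='x', y='y')\n", "demo.save('demo_chart.html')  # standalone HTML embedding the chart spec\n", "```\n", "\n", "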
To embed the chart in your own HTML file, copy the JavaScript from the 'View Source' option under the three-dot menu to the right of the plot. " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }