{ "cells": [ { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt, seaborn as sn, mpld3\n", "import pandas as pd, imp, glob, os, numpy as np\n", "from sqlalchemy import create_engine\n", "sn.set_context('notebook')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# TOC trends 2015: database clean-up (part 3)\n", "\n", "Back from holiday and trying to remember where I'd got to with the ICPW database. John has sent detailed replies to my questions regarding the US sites (see e-mails received 25/08/2016 at 01:45 and 03:33), so this seems like a good place to start.\n", "\n", "## 1. Checking US sites\n", "\n", "My initial clean-up of RESA2 ended with 72 sites linked to the `ICPW_TOCTRENDS_2015_US_LTM` project (see section2 of [this notebook](http://nbviewer.jupyter.org/url/www.googledrive.com/host/0BximeC_RweaeUy1jd2k3Nm1kdms/toc_trends_2015_data_cleaning.ipynb) for details) and, following advice from Tore, I also shifted 19 US sites into a new project called `ICPW_TOCTRENDS_2015_US_LTM_EXCLUDED`. In addition, manually checking RESA2 identifies 95 sites associated with the `ICPWaters US` project. \n", "\n", "Based on the information in John's emails, some of this needs revisiting.\n", "\n", "### 1.1. Connect to db and check site numbers are as expected" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Import custom functions and connect to db\n", "resa2_basic_path = (r'C:\\Data\\James_Work\\Staff\\Heleen_d_W\\ICP_Waters\\Upload_Template'\n", " r'\\useful_resa2_code.py')\n", "\n", "resa2_basic = imp.load_source('useful_resa2_code', resa2_basic_path)\n", "\n", "engine, conn = resa2_basic.connect_to_resa2()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1679 95\n", "3870 72\n", "4150 19\n" ] } ], "source": [ "# Check numbers of sites\n", "proj_codes = [1679, # ICPWaters US\n", " 3870, # ICPW_TOCTRENDS_2015_US_LTM\n", " 4150] # ICPW_TOCTRENDS_2015_US_LTM_EXCLUDED\n", "\n", "for code in proj_codes:\n", " # Read stations list\n", " sql = ('SELECT * '\n", " 'FROM resa2.projects_stations '\n", " 'WHERE project_id = %s' % code)\n", " df = pd.read_sql_query(sql, engine)\n", " \n", " print code, len(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.2. Check sites against John's spreadsheet\n", "\n", "Based on John's e-mail, four of the sites that I previously excluded should actually be included in the trends project, so I need to move these back across:\n", "\n", "| Site ID | Site name |\n", "|:--------:|:-----------------------------:|\n", "| 040874O | Brook Trout Lake, Adirondacks |\n", "| 1A1-109O | Moss Lake, Adirondacks |\n", "| 1A1-110O | Lake Rondaxe, Adirondacks |\n", "| ME-4474 | Duck Pond, Maine |\n", "\n", "John has sent a spreadsheet (*U.S.Site.Reconciliation.August.2016.xlsx*) listing which sites should be included in which projects. First, compare the number of sites in John's spreadsheet with the numbers in RESA2." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "John, core ICPW sites: 95\n", "John, DOC trends sites: 76\n" ] } ], "source": [ "# Read John's spreadsheet\n", "in_xls = (r'C:\\Data\\James_Work\\Staff\\Heleen_d_W\\ICP_Waters\\Call_for_Data_2016'\n", " '\\Replies\\usa\\U.S.Site.Reconciliation.August.2016.xlsx')\n", "\n", "j_df = pd.read_excel(in_xls, sheetname='Sheet1')\n", "\n", "print 'John, core ICPW sites: ', len(j_df[j_df['INCLUDE IN ICP WATERS DATABASE']=='YES'])\n", "print 'John, DOC trends sites:', len(j_df[j_df['INCLUDE IN ICP DOC ANALYSIS']=='YES'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This looks promising. With a bit of luck, I can just move the four sites listed above from the `EXCLUDED` project back into `ICPW_TOCTRENDS_2015_US_LTM` and everything will be OK...\n", "\n", "...actually, no - it's not that simple. Of the 95 `ICPWaters US` sites in the spreadsheet, John only has RESA2 codes for 91 of them (four are marked NA). This implies the 95 sites currently in RESA2 are not the same as the 95 in the spreadsheet, so I need to work out where the discrepancies are." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
station_idstation_codestation_namelatitudelongitudeStation CodeNFC_SITEIDNFC_SITENAMELatitudeLongitude
6023647.0US126Little, Winhall, VermontNaNNaNNaNNaNNaNNaNNaN
6923643.0US122Cow Mountain, VermontNaNNaNNaNNaNNaNNaNNaN
7023645.0US124Kettle, VermontNaNNaNNaNNaNNaNNaNNaN
7123620.01C3-075ESouth � Marlboro, Vermont42.8439-72.7125NaNNaNNaNNaNNaN
95NaNNaNNaNNaNNaNNaN1A1-029OMIDDLE POND44.3369-74.3719
96NaNNaNNaNNaNNaNNaN1A2-076OBARNES LAKE43.5667-75.2278
97NaNNaNNaNNaNNaNNaN1C1-106ESUCKER42.8250-73.1292
98NaNNaNNaNNaNNaNNaNME-9997OTUNK LAKE44.5986-68.0592
\n", "
" ], "text/plain": [ " station_id station_code station_name latitude longitude \\\n", "60 23647.0 US126 Little, Winhall, Vermont NaN NaN \n", "69 23643.0 US122 Cow Mountain, Vermont NaN NaN \n", "70 23645.0 US124 Kettle, Vermont NaN NaN \n", "71 23620.0 1C3-075E South � Marlboro, Vermont 42.8439 -72.7125 \n", "95 NaN NaN NaN NaN NaN \n", "96 NaN NaN NaN NaN NaN \n", "97 NaN NaN NaN NaN NaN \n", "98 NaN NaN NaN NaN NaN \n", "\n", " Station Code NFC_SITEID NFC_SITENAME Latitude Longitude \n", "60 NaN NaN NaN NaN NaN \n", "69 NaN NaN NaN NaN NaN \n", "70 NaN NaN NaN NaN NaN \n", "71 NaN NaN NaN NaN NaN \n", "95 NaN 1A1-029O MIDDLE POND 44.3369 -74.3719 \n", "96 NaN 1A2-076O BARNES LAKE 43.5667 -75.2278 \n", "97 NaN 1C1-106E SUCKER 42.8250 -73.1292 \n", "98 NaN ME-9997O TUNK LAKE 44.5986 -68.0592 " ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get the 95 'ICPWaters US' sites from RESA2\n", "sql = ('SELECT * FROM resa2.stations '\n", " 'WHERE station_id IN (SELECT station_id '\n", " 'FROM resa2.projects_stations '\n", " 'WHERE project_id = 1679)')\n", "core_df = pd.read_sql_query(sql, engine)\n", "\n", "# Outer join based on RESA2 station code\n", "df = core_df.merge(j_df[j_df['INCLUDE IN ICP WATERS DATABASE']=='YES'], \n", " how='outer', \n", " left_on='station_code',\n", " right_on='Station Code')\n", "\n", "# Cols of interest\n", "cols = ['station_id', 'station_code', 'station_name', 'latitude', 'longitude',\n", " 'Station Code', 'NFC_SITEID', 'NFC_SITENAME', 'Latitude', 'Longitude']\n", "df = df[cols]\n", "\n", "# Get entries where no match was found\n", "df[pd.isnull(df['station_code']) | pd.isnull(df['Station Code'])]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, there are four stations (rows 60, 69, 70 and 71 in the table above) currently included in the `ICPWaters US` project that should be removed. There are also four stations (rows 95 to 98 above) that are present in John's spreadsheet but not associated with the project. Of these, two of them (Middle Pond and Tunk) are already in the database but associated with the `ICPW_TOCTRENDS_2015_US_LTM` project only, while one of them (Sucker) has been shifted to the `ICPW_TOCTRENDS_2015_US_LTM_EXCLUDED` project. The remaining site (Barnes Lake) does not appear anywhere in the database as far as I can tell.\n", "\n", "Next steps:\n", "\n", " * Remove four sites (Little, Cow Mountain, Kettle and South Marlboro) from the `ICPWaters US` project (and transfer them to `EXCLUDED`)\n", " * Add Middle Pond, Sucker and Tunk Lake to the `ICPWaters US` project\n", " * Create a new site for Barnes Lake and link it to the `ICPWaters US` project\n", " * Move four sites (Brook Trout Lake, Moss Lake, Lake Rondaxe and Duck Pond) from `ICPW_TOCTRENDS_2015_US_LTM_EXCLUDED` to `ICPW_TOCTRENDS_2015_US_LTM`\n", " \n", "These changes are most easily made manually using Access. 
As a quick check, let's make sure there are now 95 sites associated with `ICPWaters US` and 76 sites associated with `ICPW_TOCTRENDS_2015_US_LTM`." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1679 95\n", "3870 76\n" ] } ], "source": [ "# Check numbers of sites\n", "proj_codes = [1679, # ICPWaters US\n", " 3870] # ICPW_TOCTRENDS_2015_US_LTM\n", " \n", "for code in proj_codes:\n", " # Read stations list\n", " sql = ('SELECT * '\n", " 'FROM resa2.projects_stations '\n", " 'WHERE project_id = %s' % code)\n", " df = pd.read_sql_query(sql, engine)\n", " \n", " print code, len(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.3. The Mystery of Mud Pond\n", "\n", "I wish this section were as exciting as the title makes it sound!\n", "\n", "Section 2 of the [previous notebook](http://nbviewer.jupyter.org/url/www.googledrive.com/host/0BximeC_RweaeUy1jd2k3Nm1kdms/toc_trends_2015_data_cleaning.ipynb) identified two Mud Ponds. The one with RESA2 `station_id=37063` is correct, but the other (`station_id=23709`) has incorrect co-ordinates and is several hundred kilometres from the true location. The incorrect site has become associated with lots of different projects, which means that deleting it is not straightforward. Let's get a list of projects using each of these sites." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Projects using station_id 23709:\n", "\n", " project_id project_name\n", "0 1559 ICP Waters\n", "1 1679 ICPWaters US\n", "2 1901 ICPW-Nitrogen\n", "3 1961 TOCICP_MAINE\n", "4 2240 ICPW_TRENDS\n", "5 3245 ICPW Regional Trend ESO4*\n", "6 3246 ICPW Regional Trend ECA*+EMG*\n", "7 3247 ICPW Regional Trend TOC/DOC\n", "8 3248 ICPW Regional Trend ANC\n", "9 3249 ICPW Regional Trend Alk\n", "10 3250 ICPW Regional Trend H+\n", "11 3251 ICPW Regional Trend ENO3\n", "12 3765 ICPW_TOCTRENDS2006\n", "13 3785 ICPW_TOCTRENDS2006_AM\n", "14 3789 TOCICP-US\n", "15 3870 ICPW_TOCTRENDS_2015_US_LTM\n", "\n", "\n", "Projects using station_id 37063:\n", "\n", " project_id project_name\n", "0 3870 ICPW_TOCTRENDS_2015_US_LTM\n", "\n", "\n" ] } ], "source": [ "# Find projects using the different versions of Mud Pond\n", "for site_id in [23709, 37063]:\n", " sql = ('SELECT project_id, project_name '\n", " 'FROM resa2.projects '\n", " 'WHERE project_id IN (SELECT project_id '\n", " 'FROM resa2.projects_stations '\n", " 'WHERE station_id = %s)' % site_id)\n", " df = pd.read_sql_query(sql, engine)\n", " \n", " print 'Projects using station_id %s:\\n' % site_id\n", " print df\n", " print '\\n'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, the incorrect site has been used in 16 different projects, most (all?) of which are linked to ICP Waters in some way. The only project using the correct site is `ICPW_TOCTRENDS_2015_US_LTM`. Before I can delete the incorrect site, I need to switch all these other projects to use `station_id=37063` instead.\n", "\n", "**NB:** Based on the above output, `ICPW_TOCTRENDS_2015_US_LTM` currently uses *both* Mud Ponds, so when I delete the incorrect one, there will only be 75 sites associated with this project. This means I'm still missing a site somewhere compared to John's spreadsheet. 
**Come back to this**.\n", "\n", "In Access:\n", "\n", " * Delete the link between `station_id=23709` and `ICPW_TOCTRENDS_2015_US_LTM` in the `projects_stations` table.\n", " * For the other 15 projects, switch the `station_id` from `23709` to `37063`.\n", " * Delete all references to site `23709` from the `stations_par_values` table.\n", " \n", "The tricky part is how to deal with the water chemistry results associated with each of these sites." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of samples associated with site 23709: 88\n", "Number of samples associated with site 37063: 107\n", "Number of samples with common dates: 63\n" ] } ], "source": [ "# Read sample info for incorrect site\n", "sql = ('SELECT water_sample_id, station_id, sample_date '\n", " 'FROM resa2.water_samples '\n", " 'WHERE station_id = 23709')\n", "df_23709 = pd.read_sql_query(sql, engine)\n", "\n", "# Read sample info for correct site\n", "sql = ('SELECT water_sample_id, station_id, sample_date '\n", " 'FROM resa2.water_samples '\n", " 'WHERE station_id = 37063')\n", "df_37063 = pd.read_sql_query(sql, engine)\n", "\n", "# Outer join\n", "df = df_23709.merge(df_37063, how='outer',\n", " on='sample_date',\n", " suffixes=['_23709', '_37063'])\n", "\n", "print 'Number of samples associated with site 23709:', len(df_23709)\n", "print 'Number of samples associated with site 37063:', len(df_37063)\n", "print 'Number of samples with common dates: ', len(df.dropna(how='any')) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It looks as though there is at least some overlap here, but it's not easy to see what's going on without some plots. Let's pick a commonly measured parameter ($Cl^-$ in $mg/l$) and plot the values for each site to see whether they're actually the same." ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " Total number of samples in merged series: 132\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "
\n", "" ], "text/plain": [ "" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Site 23709\n", "sql = ('SELECT sample_id, value '\n", " 'FROM resa2.water_chemistry_values2 '\n", " 'WHERE method_id = 10253 '\n", " 'AND sample_id IN (SELECT water_sample_id '\n", " 'FROM resa2.water_samples '\n", " 'WHERE station_id = 23709)')\n", "df_23709 = pd.read_sql_query(sql, engine)\n", "\n", "# Join dates\n", "df_23709 = df_23709.merge(df, how='left',\n", " left_on='sample_id',\n", " right_on='water_sample_id_23709')\n", "\n", "# Tidy df\n", "df_23709.sort_values(by='sample_date', inplace=True, ascending='True')\n", "df_23709.index = df_23709['sample_date']\n", "df_23709 = df_23709[['value']]\n", "df_23709.columns = ['site_23709']\n", "\n", "# Site 37063\n", "sql = ('SELECT sample_id, value '\n", " 'FROM resa2.water_chemistry_values2 '\n", " 'WHERE method_id = 10253 '\n", " 'AND sample_id IN (SELECT water_sample_id '\n", " 'FROM resa2.water_samples '\n", " 'WHERE station_id = 37063)')\n", "df_37063 = pd.read_sql_query(sql, engine)\n", "\n", "# Join dates\n", "df_37063 = df_37063.merge(df, how='left',\n", " left_on='sample_id',\n", " right_on='water_sample_id_37063')\n", "\n", "# Tidy df\n", "df_37063.sort_values(by='sample_date', inplace=True, ascending='True')\n", "df_37063.index = df_37063['sample_date']\n", "df_37063 = df_37063[['value']]\n", "df_37063.columns = ['site_37063']\n", "\n", "# Merge\n", "merged_df = df_23709.merge(df_37063, how='outer',\n", " left_index=True,\n", " right_index=True)\n", "\n", "print 'Total number of samples in merged series:', len(merged_df)\n", "\n", "# Plot\n", "merged_df.plot(figsize=(12, 8))\n", "plt.title('Chloride concentration in mg/l at two Mud Ponds', fontsize=20)\n", "plt.tight_layout()\n", "mpld3.display()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It's pretty clear that this is the same data series (although something weird happens in 1996 - use the pan and zoom tools to see this). However, there appears to be no consistent pattern regarding which samples have been assigned to which site. The 1996 discrepancy seems pretty minor, so the best way to fix this is probably as follows:\n", "\n", " * For dates where samples are recorded at both sites, choose data from site 37063 in preference to those from site 23709.\n", " * On dates where there are samples *only* for site 23709, transfer these to site 37063. \n", "\n", "This should create a single series for site 37063 with 132 water samples. Any remaining samples associated with site 23709 can then be deleted (as they're already duplicated for site 37063) and then, with a bit of luck, I should also be able to remove the \"bad\" Mud Pond from the database without inflicting too much damage elsewhere." ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Identify samples associated ONLY with site 23709\n", "trans_df = df[pd.isnull(df['station_id_37063'])]\n", "\n", "# These need transferring to site 37063\n", "sql = ('UPDATE resa2.water_samples '\n", " 'SET station_id = 37063 '\n", " 'WHERE water_sample_id IN %s' \n", " % str(tuple(trans_df['water_sample_id_23709'].map(int))))\n", "\n", "result = conn.execute(sql)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now plot the results for chloride again to make sure it's worked." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "
\n", "" ], "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Site 23709\n", "# Get dates\n", "sql = ('SELECT water_sample_id, sample_date '\n", " 'FROM resa2.water_samples '\n", " 'WHERE station_id = 23709')\n", "dates_23709 = pd.read_sql_query(sql, engine)\n", "\n", "sql = ('SELECT sample_id, value '\n", " 'FROM resa2.water_chemistry_values2 '\n", " 'WHERE method_id = 10253 '\n", " 'AND sample_id IN (SELECT water_sample_id '\n", " 'FROM resa2.water_samples '\n", " 'WHERE station_id = 23709)')\n", "df_23709 = pd.read_sql_query(sql, engine)\n", "\n", "# Join dates\n", "df_23709 = df_23709.merge(dates_23709, how='left',\n", " left_on='sample_id',\n", " right_on='water_sample_id')\n", "\n", "# Tidy df\n", "df_23709.sort_values(by='sample_date', inplace=True, ascending='True')\n", "df_23709.index = df_23709['sample_date']\n", "df_23709 = df_23709[['value']]\n", "df_23709.columns = ['site_23709']\n", "\n", "# Site 37063\n", "# Get dates\n", "sql = ('SELECT water_sample_id, sample_date '\n", " 'FROM resa2.water_samples '\n", " 'WHERE station_id = 37063')\n", "dates_37063 = pd.read_sql_query(sql, engine)\n", "\n", "sql = ('SELECT sample_id, value '\n", " 'FROM resa2.water_chemistry_values2 '\n", " 'WHERE method_id = 10253 '\n", " 'AND sample_id IN (SELECT water_sample_id '\n", " 'FROM resa2.water_samples '\n", " 'WHERE station_id = 37063)')\n", "df_37063 = pd.read_sql_query(sql, engine)\n", "\n", "# Join dates\n", "df_37063 = df_37063.merge(dates_37063, how='left',\n", " left_on='sample_id',\n", " right_on='water_sample_id')\n", "\n", "# Tidy df\n", "df_37063.sort_values(by='sample_date', inplace=True, ascending='True')\n", "df_37063.index = df_37063['sample_date']\n", "df_37063 = df_37063[['value']]\n", "df_37063.columns = ['site_37063']\n", "\n", "# Merge\n", "merged_df = df_23709.merge(df_37063, how='outer',\n", " left_index=True,\n", " right_index=True)\n", "\n", "# Plot\n", "merged_df.plot(subplots=True, sharey=True, figsize=(12, 8))\n", "plt.tight_layout()\n", "mpld3.display()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This looks OK, so I can now delete all the remaining records associated with `station_id=23709`...\n", "\n", "...Actually, I can't delete these samples. They're protected by the database constraint `SS_WSID_FK`, which links `water_samples` to `sample_selections`. I'm not sure what the `sample_selections` tables is, and I'm wary of deleting records from tables I don't understand in case it breaks some other component of the wider LabWare-RESA2-NivaBase system.\n", "\n", "For now, I'll leave site 23709 in the database, along with the remaining samples that are associated with it. All i've actually done here is transfer 25 water samples from the early record of station 23709 to station 37063. It won't be difficult to undo these changes if necessary, but it would be good to tidy this up properly. **Come back to this** and find out what the `sample_selections` table actually does.\n", "\n", "### 1.4. The Mystery of Multiple sites\n", "\n", "It seems the Mystery of Mud Pond runs a bit deeper than originally thought. 
Many of the ICPW sites in RESA2 are duplicated, with the following as a typical example:\n", "\n", "| station_id | station_code | station_name | lat | long |\n", "|:----------:|:------------:|:------------------------:|:-------:|:--------:|\n", "| 23661 | US26 | Biscuit Brook, Catskills | 41.9869 | -74.5031 |\n", "| 37014 | X15:1434025 | BISCUIT BROOK | 42.0111 | -74.4147 |\n", "\n", "The first entry (`station_id 23661`) is associated with the \"core\" ICP Waters programme, whereas the second (`station_id 37014`) was added later as part of the trend analysis project. Despite the slight differences in co-ordinates, **these are actually the same site**.\n", "\n", "Adding stations twice is a little dangerous in a relational database, as it breaks the principle of database normalisation. Nevertheless, as Tore points out in his e-mail of 01/09/2016 at 08:51, it is sometimes justified as the most convenient thing to do. There are some potential problems with this, though. For example, when John and Don were here in May, they extracted site data for both the `ICPWaters US` and `ICPW_TOCTRENDS_2015_US_LTM` projects and were understandably confused by the different site codes and co-ordinates. More importantly, **when sites are duplicated, it's very easy to forget to update both versions when new data comes in**. The result, as illustrated by Mud Pond above, is that we end up with two patchy data records, neither of which tells the whole story by itself. This wouldn't be a problem if the `ICPWaters XX` projects (and the associated sites, e.g. US26 in the table above) were \"archived\" to represent the state of the data at the time of some previous analysis. Any more recent data would then be added to a new project. However, as far as I can tell, this isn't the case: each year new data from the Focal Centres are added to the `ICPWaters XX` sites, while `ICPW_TOCTRENDS_2015_XX` is more of a \"static\" dataset, based on a bulk data upload at a single point in time.\n", "\n", "For the US sites, John has produced a spreadsheet that attempts to sort all this out, by linking the RESA2 codes used by `ICPWaters` to USEPA codes and corrected site properties. The main question now is **how to use this information**.\n", "\n", "Given that the `ICPWaters XX` projects do not represent an accurate archive of the data at some previous time point (because more data has been added subsequently), and considering that many of the latitudes and longitudes for these stations are now known to be incorrect, there doesn't seem to be much justification for keeping the duplicate sites. However, because the water samples associated with the individual sites in each pair are different (see the Mud Pond example above), it's not necessarily straightforward to merge the duplicated datasets. A simpler option would be to keep the sites separate but to correct/update the site properties (names, co-ordinates etc.) based on John's spreadsheet, so at least the duplicated sites are *consistently* duplicated. This would require less effort, but would likely lead to further confusion in the future, and it also means the updated trends analysis would not make use of the full dataset.\n", "\n", "#### 1.4.1. Which US sites are in the database?\n", "\n", "Rather than worrying about which projects have which sites, a useful preliminary step will be to simply check which of the sites in John's spreadsheet are actually in the database, and whether the properties are correct. Even this isn't entirely straightforward, but I've had a go using the code below.
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
station_idstation_codestation_namelatitudelongitude
01101845-601Tennvatn67.59379615.487890
1111604-608�vre Jerpetjern59.6080009.427000
21121101-43Glypstadvatnet58.4872236.188384
31132030-701Serdivatnet69.60607530.374109
4115831-501Brårvatn59.2949217.727118
\n", "
" ], "text/plain": [ " station_id station_code station_name latitude longitude\n", "0 110 1845-601 Tennvatn 67.593796 15.487890\n", "1 111 604-608 �vre Jerpetjern 59.608000 9.427000\n", "2 112 1101-43 Glypstadvatnet 58.487223 6.188384\n", "3 113 2030-701 Serdivatnet 69.606075 30.374109\n", "4 115 831-501 Brårvatn 59.294921 7.727118" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Read stations table\n", "sql = ('SELECT station_id, station_code, station_name, latitude, longitude '\n", " 'FROM resa2.stations')\n", "\n", "stn_df = pd.read_sql_query(sql, engine)\n", "stn_df.head()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
station_codestation_namesuggested_nameactivecoretrendnfc_codenfc_namelatitudelongitude
0US24New York, Catskill Mnt., Rondout CreekRondout Creek, CatskillsYESYESYES01364959RONDOUT CREEK41.9367-74.3764
1US25W Br Neversink R At Winnisook, CatskillsWest Branch Neversink River at Winnisook, Cats...YESYESYES01434021W BR NEVERSINK R AT WINNISOOK41.9725-74.4483
2US26Biscuit Brook, CatskillsBiscuit Brook, CatskillsYESYESYES01434025BISCUIT BROOK42.0111-74.4147
3US23New York, Catskill Mnt., E. Branch Neversink, ...East Branch Neversink River, CatskillsYESYESYES0143400680EAST BRANCH NEVERSINK, HEADWAT41.9869-74.5031
4US27Little Hope Pond, AdirondacksLittle Hope Pond, AdirondacksYESYESNO, road salt contamination020058OLITTLE HOPE POND44.5158-74.1252
\n", "
" ], "text/plain": [ " station_code station_name \\\n", "0 US24 New York, Catskill Mnt., Rondout Creek \n", "1 US25 W Br Neversink R At Winnisook, Catskills \n", "2 US26 Biscuit Brook, Catskills \n", "3 US23 New York, Catskill Mnt., E. Branch Neversink, ... \n", "4 US27 Little Hope Pond, Adirondacks \n", "\n", " suggested_name active core \\\n", "0 Rondout Creek, Catskills YES YES \n", "1 West Branch Neversink River at Winnisook, Cats... YES YES \n", "2 Biscuit Brook, Catskills YES YES \n", "3 East Branch Neversink River, Catskills YES YES \n", "4 Little Hope Pond, Adirondacks YES YES \n", "\n", " trend nfc_code nfc_name \\\n", "0 YES 01364959 RONDOUT CREEK \n", "1 YES 01434021 W BR NEVERSINK R AT WINNISOOK \n", "2 YES 01434025 BISCUIT BROOK \n", "3 YES 0143400680 EAST BRANCH NEVERSINK, HEADWAT \n", "4 NO, road salt contamination 020058O LITTLE HOPE POND \n", "\n", " latitude longitude \n", "0 41.9367 -74.3764 \n", "1 41.9725 -74.4483 \n", "2 42.0111 -74.4147 \n", "3 41.9869 -74.5031 \n", "4 44.5158 -74.1252 " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get a subset of columns from John's spreadsheet\n", "us_df = j_df[['Station Code', 'Station name', 'Suggested Name',\n", " 'Active', 'INCLUDE IN ICP WATERS DATABASE', \n", " 'INCLUDE IN ICP DOC ANALYSIS', 'NFC_SITEID',\n", " 'NFC_SITENAME', 'Latitude', 'Longitude']]\n", "\n", "# Rename columns\n", "us_df.columns = ['station_code', 'station_name', 'suggested_name',\n", " 'active', 'core', 'trend', 'nfc_code', 'nfc_name', \n", " 'latitude', 'longitude']\n", "\n", "us_df.head()" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\Data\\64_Bit_WinPython\\python-2.7.10.amd64\\lib\\site-packages\\ipykernel\\__main__.py:31: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal\n" ] }, { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
stn_codecode_presentname_oklat_oklon_ok
0US241FalseFalseFalse
1US251FalseFalseFalse
2US261TrueFalseFalse
3US231FalseFalseFalse
4US271TrueTrueTrue
\n", "
" ], "text/plain": [ " stn_code code_present name_ok lat_ok lon_ok\n", "0 US24 1 False False False\n", "1 US25 1 False False False\n", "2 US26 1 True False False\n", "3 US23 1 False False False\n", "4 US27 1 True True True" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Dict to store results\n", "res_dict = {'stn_code':[],\n", " 'code_present':[],\n", " 'name_ok':[],\n", " 'lat_ok':[],\n", " 'lon_ok':[]}\n", "\n", "# Iterate over spreadsheet\n", "for idx in range(len(us_df)):\n", " # Get stn id\n", " stn_cd = us_df.ix[idx, 'station_code']\n", " name = us_df.ix[idx, 'suggested_name'] # These are John's ideal names\n", " lat = us_df.ix[idx, 'latitude']\n", " lon = us_df.ix[idx, 'longitude']\n", " \n", " # Check stations table\n", " q_res = stn_df.query('station_code == @stn_cd')\n", " \n", " if len(q_res) == 0:\n", " # The site isn't present with this code\n", " res_dict['stn_code'].append(stn_cd)\n", " res_dict['code_present'].append(0)\n", " res_dict['name_ok'].append(np.nan)\n", " res_dict['lat_ok'].append(np.nan)\n", " res_dict['lon_ok'].append(np.nan)\n", " \n", " elif len(q_res) == 1:\n", " # Check the site properties\n", " res_dict['stn_code'].append(stn_cd)\n", " res_dict['code_present'].append(1)\n", " res_dict['name_ok'].append(name == q_res.station_name.values[0])\n", " res_dict['lat_ok'].append(lat == q_res.latitude.values[0])\n", " res_dict['lon_ok'].append(lon == q_res.longitude.values[0])\n", " \n", " else:\n", " # This should be impossible\n", " raise ValueError('Site code %s appears to be duplicated.' % stn_cd)\n", "\n", "# Build df \n", "res_df = pd.DataFrame(res_dict)\n", "res_df = res_df[['stn_code', 'code_present', \n", " 'name_ok', 'lat_ok', 'lon_ok']]\n", "\n", "res_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note the Unicode warning here, highlighting that some matches on the `name` column have failed due to issues with special text characters, so the results are probably a worst case scenario. Nevertheless, some of the sites match exactly. Let's see how many matches and partial matches we have." ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of records in spreadsheet: 110\n", "Of these...\n", "Number matched by code: 95\n", "Number matched by code with correct lat and lon: 67\n", "Number matched by code with correct name, lat and lon: 58\n" ] } ], "source": [ "# How many matches?\n", "print 'Total number of records in spreadsheet: ', len(res_df)\n", "print 'Of these...'\n", "print 'Number matched by code: ', len(res_df.query('code_present==1'))\n", "print 'Number matched by code with correct lat and lon: ', len(res_df.query('lat_ok==1 & lon_ok==1'))\n", "print 'Number matched by code with correct name, lat and lon:', len(res_df.query('name_ok==1 & lat_ok==1 & lon_ok==1'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The script above uses RESA2 codes taken from John's spreadsheet. For the most part these are the old codes (e.g. `US26`) associated with the original `ICPWaters US` project (rather than the more recent codes, such as `X15:1434025`, used in the trends project). It seems that 58 of these sites have exactly the correct information (name, lat and long), and a further 9 have the correct lat and long, some do not use John's suggested ideal names. 
A further 28 sites can be matched by code, but have incorrect geographic co-ordinates.\n", "\n", "In total, there are 95 matches based on site codes. A further 15 sites (the ones marked `NA` in John's spreadsheet) cannot be matched at all but, of these, only 4 should be included in the core ICPW dataset and only 2 in the trends analysis. These four sites have already been dealt with above (see the start of section 1.2). They *are* in the database with suitable codes - it's just that these codes are not in John's workbook.\n", "\n", "What about the sites created as duplicates for the trend analysis project? How many of those can be matched from John's spreadsheet?" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
nfc_codecode_presentname_oklat_oklon_ok
0013649591TrueTrueTrue
1014340211FalseTrueTrue
2014340251TrueTrueTrue
301434006801FalseTrueTrue
4020058O1TrueTrueTrue
\n", "
" ], "text/plain": [ " nfc_code code_present name_ok lat_ok lon_ok\n", "0 01364959 1 True True True\n", "1 01434021 1 False True True\n", "2 01434025 1 True True True\n", "3 0143400680 1 False True True\n", "4 020058O 1 True True True" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Dict to store results\n", "res_dict = {'nfc_code':[],\n", " 'code_present':[],\n", " 'name_ok':[],\n", " 'lat_ok':[],\n", " 'lon_ok':[]}\n", "\n", "# Iterate over spreadsheet\n", "for idx in range(len(us_df)):\n", " # Get stn id\n", " nfc_cd = us_df.ix[idx, 'nfc_code']\n", " name = us_df.ix[idx, 'nfc_name'] # These are John's ideal names\n", " lat = us_df.ix[idx, 'latitude']\n", " lon = us_df.ix[idx, 'longitude']\n", " \n", " # Check stations table. Need to add 'X15:' and allow for variants\n", " q_res = stn_df[(stn_df['station_code']=='X15:%s' % nfc_cd) |\n", " (stn_df['station_code']=='X15:%s' % nfc_cd[1:]) |\n", " (stn_df['station_code']=='X15:%s' % nfc_cd[:-1])]\n", " \n", " if len(q_res) == 0:\n", " # The site isn't present with this code\n", " res_dict['nfc_code'].append(nfc_cd)\n", " res_dict['code_present'].append(0)\n", " res_dict['name_ok'].append(np.nan)\n", " res_dict['lat_ok'].append(np.nan)\n", " res_dict['lon_ok'].append(np.nan)\n", " \n", " elif len(q_res) == 1:\n", " # Check the site properties\n", " res_dict['nfc_code'].append(nfc_cd)\n", " res_dict['code_present'].append(1)\n", " res_dict['name_ok'].append(name == q_res.station_name.values[0])\n", " res_dict['lat_ok'].append(lat == q_res.latitude.values[0])\n", " res_dict['lon_ok'].append(lon == q_res.longitude.values[0])\n", " \n", " else:\n", " # This should be impossible\n", " raise ValueError('Site code %s appears to be duplicated.' % stn_cd)\n", "\n", "# Build df \n", "res_df = pd.DataFrame(res_dict)\n", "res_df = res_df[['nfc_code', 'code_present', \n", " 'name_ok', 'lat_ok', 'lon_ok']]\n", "\n", "res_df.head()" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of records in spreadsheet: 110\n", "Of these...\n", "Number matched by code: 89\n", "Number matched by code with correct lat and lon: 74\n", "Number matched by code with correct name, lat and lon: 68\n" ] } ], "source": [ "# How many matches?\n", "print 'Total number of records in spreadsheet: ', len(res_df)\n", "print 'Of these...'\n", "print 'Number matched by code: ', len(res_df.query('code_present==1'))\n", "print 'Number matched by code with correct lat and lon: ', len(res_df.query('lat_ok==1 & lon_ok==1'))\n", "print 'Number matched by code with correct name, lat and lon:', len(res_df.query('name_ok==1 & lat_ok==1 & lon_ok==1'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This approach manages to match most (but not necessarily all) of the US sites using th `X15:` convention, but note that I had to jump through a few hoops to get here: the code above initially tried to find matches just using the raw NFC code with `X15:` prepended to it, but this failed in many cases. Two causes (there may be others) are:\n", "\n", " * Leading zeros on the NFC site codes have been truncated upon adding to RESA. For example, station `01364959` is in the database `X15:1364959`.

\n", " \n", " * Trailing letter codes have been removed (or weren't included originally) upon adding to RESA2. For example, station `1C1-112E` is in the database as `X15:1C1-112`.\n", " \n", "Unfortunately, these kinds of discrepancies makes automated cleaning of the database difficult.\n", "\n", "A final test for the moment is to see whether I've actually managed to find all the stations that John would like including in the trends analysis." ] }, { "cell_type": "code", "execution_count": 78, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
station_codestation_namesuggested_nameactivecoretrendnfc_codenfc_namelatitudelongitudecode_presentname_oklat_oklon_ok
65US74Mud Pond, MaineMud Pond, MaineYESYESYES1E1-134EMUD POND44.6330-68.0908NaNNaNNaNNaN
74US82Newbert Pond, MaineNewbert Pond, MaineYESYESYESME-9998ENEWBERT POND44.3364-69.2711NaNNaNNaNNaN
\n", "
" ], "text/plain": [ " station_code station_name suggested_name active core trend \\\n", "65 US74 Mud Pond, Maine Mud Pond, Maine YES YES YES \n", "74 US82 Newbert Pond, Maine Newbert Pond, Maine YES YES YES \n", "\n", " nfc_code nfc_name latitude longitude code_present name_ok lat_ok \\\n", "65 1E1-134E MUD POND 44.6330 -68.0908 NaN NaN NaN \n", "74 ME-9998E NEWBERT POND 44.3364 -69.2711 NaN NaN NaN \n", "\n", " lon_ok \n", "65 NaN \n", "74 NaN " ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Filter dfs\n", "match_df = res_df.query('code_present==1')\n", "trend_df = us_df.query('trend == \"YES\"')\n", "\n", "# Join\n", "trend_df = trend_df.merge(match_df, how='left',\n", " on='nfc_code')\n", "\n", "\n", "# Identify records with no match\n", "trend_df[pd.isnull(trend_df['code_present'])]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So I've manged to match all the sites for the trends work, with two exceptions. The first is (the increasingly troublesome) Mud Pond, which hasn't matched because when I tidied it up above I changed the site code for station `37063` back to the original `US74`, rather than the `X15:` version. The second is Newbert Pond, which is in the database as part of the core ICP Waters programme, but apparently hasn't previously been part of the trends analysis.\n", "\n", "### Summary\n", "\n", "This exploration proves that, with a bit of work, I can identify the sites that John would like including in each project and clean up their properties. I haven't really begun to tackle the problem of actually integrating the data, though - that's going to be much more challenging. \n", "\n", "It's also worth emphasising that the above was only really possible thanks to John's spreadsheet, so doing the same thing for other countries may be more difficult." ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.10" } }, "nbformat": 4, "nbformat_minor": 0 }