{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Cruise Track Plot](Viz_CruiseTrack.ipynb) | [Index](Index.ipynb) | [Correlation Matrix Along Cruise Track](Viz_CruiseCorrelationMatrix.ipynb) >\n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/simonscmap/pycmap/blob/master/docs/Viz_CorrelationMatrix.ipynb\"><img align=\"left\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open in Colab\" title=\"Open and Execute in Google Colaboratory\"></a>\n",
    "\n",
    "<a href=\"https://mybinder.org/v2/gh/simonscmap/pycmap/master?filepath=docs%2FViz_CorrelationMatrix.ipynb\"><img align=\"right\" src=\"https://mybinder.org/badge_logo.svg\" alt=\"Open in Binder\" title=\"Open and Execute in Binder\"></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## *plot_corr_map(sourceTable, sourceVar, targetTables, targetVars, dt1, dt2, lat1, lat2, lon1, lon2, depth1, depth2, temporalTolerance, latTolerance, lonTolerance, depthTolerance, method='spearman', exportDataFlag=False, show=True)*\n",
    "\n",
    "This function computes and plots the pairwise correlation coefficients between the source and target variables. The results are visualized in the form of a correlation matrix. To compute the correlations, the source and target variables must first be colocalized (see [Match (colocalize) Datasets](Match.ipynb)). The colocalization procedure relies on the tolerance parameters, which set the matching boundaries between the source and target datasets. Note that the source has to be a single non-climatological variable. In principle, if the source dataset is fully covered by the target variable's spatio-temporal range, there should always be matching results, provided that each tolerance parameter is larger than half of the corresponding spatial/temporal resolution. Please explore the [catalog](Catalog.ipynb) to find appropriate target variables. Note that this visualization is currently supported only by the plotly visualization library.\n",
    "\n",
    "<br />Returns the generated correlation graph object, which can be used to modify the graph properties (see example below). \n",
    "\n",
    "<br />**Note:**\n",
    "<br />This method requires a valid [API key](API.ipynb). It is not necessary to set the API key every time because the API properties are stored locally after the first call.\n"
   ]
  },
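  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The rule of thumb above (a tolerance slightly larger than half of the target's resolution) can be sketched as follows. This is a minimal illustration and not part of pycmap: the helper `safe_tolerance` and its `margin` argument are hypothetical, and the resolution values are assumed examples rather than values read from the catalog."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "\n",
    "def safe_tolerance(resolution, margin=0.1, integer=False):\n",
    "    # hypothetical helper: half of the resolution plus a small fractional margin\n",
    "    tol = 0.5 * resolution * (1 + margin)\n",
    "    # round up when an integer is required (e.g. for temporalTolerance)\n",
    "    return math.ceil(tol) if integer else tol\n",
    "\n",
    "\n",
    "# assumed example: a target on a 1-degree grid with weekly (7-day) resolution\n",
    "lat_tol = safe_tolerance(1.0)\n",
    "time_tol = safe_tolerance(7, integer=True)\n",
    "print('latTolerance:', lat_tol, '| temporalTolerance:', time_tol)"
   ]
  },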
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **Parameters:** \n",
    ">> **sourceTable: string**\n",
    ">>  <br />Table name of the source dataset. A full list of table names can be found in [catalog](Catalog.ipynb).\n",
    ">> <br />\n",
    ">> <br />**sourceVar: string**\n",
    ">>  <br />The source variable short name. The target variables are matched (colocalized) with this variable. A full list of variable short names can be found in [catalog](Catalog.ipynb).\n",
    ">> <br />\n",
    ">> <br />**targetTables: list of string**\n",
    ">>  <br />Table names of the target datasets to be matched with the source data. Notice source dataset can be matched with multiple target datasets. A full list of table names can be found in [catalog](Catalog.ipynb).\n",
    ">> <br />\n",
    ">> <br />**targetVars: list of string**\n",
    ">>  <br />Variable short names to be matched with the source variable. A full list of variable short names can be found in [catalog](Catalog.ipynb).\n",
    ">> <br />\n",
    ">> <br />**dt1: string**\n",
    ">>  <br />Start date or datetime. Both source and target datasets are filtered before matching. This parameter sets the lower bound of the temporal cut.\n",
    ">> <br />\n",
    ">> <br />**dt2: string**\n",
    ">>  <br />End date or datetime. Both source and target datasets are filtered before matching. This parameter sets the upper bound of the temporal cut.\n",
    ">> <br />\n",
    ">> <br />**lat1: float**\n",
    ">>  <br />Start latitude [degree N]. Both source and target datasets are filtered before matching. This parameter sets the lower bound of the meridional cut. Note latitude ranges from -90 to 90 degrees.\n",
    ">> <br />\n",
    ">> <br />**lat2: float**\n",
    ">>  <br />End latitude [degree N]. Both source and target datasets are filtered before matching. This parameter sets the upper bound of the meridional cut. Note latitude ranges from -90 to 90 degrees.\n",
    ">> <br />\n",
    ">> <br />**lon1: float**\n",
    ">>  <br />Start longitude [degree E]. Both source and target datasets are filtered before matching. This parameter sets the lower bound of the zonal cut. Note longitude ranges from -180 to 180 degrees.\n",
    ">> <br />\n",
    ">> <br />**lon2: float**\n",
    ">>  <br />End longitude [degree E]. Both source and target datasets are filtered before matching. This parameter sets the upper bound of the zonal cut. Note longitude ranges from -180 to 180 degrees.\n",
    ">> <br />\n",
    ">> <br />**depth1: float**\n",
    ">>  <br />Start depth [m]. Both source and target datasets are filtered before matching. This parameter sets the lower bound of the vertical cut. Note depth is a positive number (depth is 0 at surface and grows towards ocean floor).\n",
    ">> <br />\n",
    ">> <br />**depth2: float**\n",
    ">>  <br />End depth [m]. Both source and target datasets are filtered before matching. This parameter sets the upper bound of the vertical cut. Note depth is a positive number (depth is 0 at surface and grows towards ocean floor).\n",
    ">> <br />\n",
    ">> <br />**temporalTolerance: list of int**\n",
    ">> <br />Temporal tolerance values between pairs of source and target datasets. The size and order of values in this list should match those of targetTables. If only a single integer value is given, it is applied to all target datasets. This parameter is in units of days, except when the target variable represents monthly climatology data, in which case it is in units of months. Note that fractional values are not supported in the current version.\n",
    ">> <br />\n",
    ">> <br />**latTolerance: list of float or int**\n",
    ">> <br />Spatial tolerance values in the meridional direction [deg] between pairs of source and target datasets. The size and order of values in this list should match those of targetTables. If only a single float value is given, it is applied to all target datasets. A \"safe\" value for this parameter is slightly larger than half of the target variable's spatial resolution.\n",
    ">> <br />\n",
    ">> <br />**lonTolerance: list of float or int**\n",
    ">> <br />Spatial tolerance values in the zonal direction [deg] between pairs of source and target datasets. The size and order of values in this list should match those of targetTables. If only a single float value is given, it is applied to all target datasets. A \"safe\" value for this parameter is slightly larger than half of the target variable's spatial resolution.\n",
    ">> <br />\n",
    ">> <br />**depthTolerance: list of float or int**\n",
    ">> <br />Spatial tolerance values in the vertical direction [m] between pairs of source and target datasets. The size and order of values in this list should match those of targetTables. If only a single float value is given, it is applied to all target datasets. \n",
    ">> <br />\n",
    ">> <br />**method: str, default: 'spearman'**\n",
    ">>  <br />Correlation algorithm. 'spearman' is a rank correlation algorithm and is a metric for monotonic relationships. Other options are **'pearson'** and **'kendall'**. *'pearson'* is the standard correlation coefficient, better suited to linear relationships. *'kendall'* evaluates the Kendall Tau correlation coefficient.\n",
    ">> <br />\n",
    ">> <br />**exportDataFlag: boolean, default: False**\n",
    ">>  <br />If True, the graph data points are stored on the local machine. The export path and file format are set by the [API's parameters](API.ipynb). \n",
    ">> <br />\n",
    ">> <br />**show: boolean, default: True**\n",
    ">>  <br />If True, the graph object is returned and is displayed. The graph file is saved on the local machine at the [**figureDir**](API.ipynb) directory. \n",
    "<br />If False, the graph object is returned but not displayed. \n",
    "\n",
    "\n",
    ">**Returns: the graph object** \n",
    ">>  Below are the graph's properties and methods.\n",
    ">>> **Properties:**\n",
    ">>>> **x: list of string**\n",
    ">>>>  <br />Correlation matrix column titles (covariate names).\n",
    ">>>> <br />\n",
    ">>>> <br />**y: list of string**\n",
    ">>>>  <br />Correlation matrix row titles (covariate names).\n",
    ">>>> <br />\n",
    ">>>> <br />**z: numpy.ndarray**\n",
    ">>>>  <br />Computed pairwise correlation coefficients.\n",
    ">>>> <br />\n",
    ">>>> <br />**cmap: str or cmocean colormap**\n",
    ">>>>  <br />Colormap name. Any matplotlib (e.g. 'viridis', ..) or cmocean (e.g. cmocean.cm.thermal, ..) colormaps can be passed to this property. A full list of matplotlib and cmocean color palettes can be found at the following links:\n",
    ">>>>  <br />https://matplotlib.org/3.1.0/tutorials/colors/colormaps.html\n",
    ">>>>  <br />https://matplotlib.org/cmocean/\n",
    ">>>> <br />\n",
    ">>>> <br />**vmin: float, default -1**\n",
    ">>>>  <br />This parameter defines the lower bound of the color code values.\n",
    ">>>> <br />\n",
    ">>>> <br />**vmax: float, default +1**\n",
    ">>>>  <br />This parameter defines the upper bound of the color code values.\n",
    ">>>> <br />\n",
    ">>>> <br />**height: int**\n",
    ">>>>  <br />Graph's height in pixels.\n",
    ">>>> <br />\n",
    ">>>> <br />**width: int**\n",
    ">>>>  <br />Graph's width in pixels.\n",
    ">>>> <br />\n",
    ">>>> <br />**title: str**\n",
    ">>>>  <br />The graph's title.\n",
    "\n",
    ">>> **Methods:**\n",
    ">>>> **render()**\n",
    ">>>>  <br />Displays the plot according to the set properties. "
   ]
  },
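  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The difference between the rank-based and the standard correlation coefficients described under **method** can be illustrated with a small synthetic example. This sketch is independent of pycmap and does not show how `plot_corr_map` computes its coefficients: the data are made up, the helper names `pearson`, `ranks`, and `spearman` are hypothetical, and the rank helper assumes there are no tied values. For a monotonic but nonlinear relationship, the rank correlation reaches 1 while the standard coefficient stays below it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# synthetic data: y increases monotonically, but not linearly, with x\n",
    "x = np.linspace(0.1, 5, 50)\n",
    "y = np.exp(x)\n",
    "\n",
    "\n",
    "def pearson(a, b):\n",
    "    # standard (linear) correlation coefficient\n",
    "    return np.corrcoef(a, b)[0, 1]\n",
    "\n",
    "\n",
    "def ranks(a):\n",
    "    # position of each value in sorted order (0 = smallest); assumes no ties\n",
    "    return np.argsort(np.argsort(a))\n",
    "\n",
    "\n",
    "def spearman(a, b):\n",
    "    # rank correlation: the standard coefficient applied to the ranks\n",
    "    return pearson(ranks(a), ranks(b))\n",
    "\n",
    "\n",
    "print('pearson :', round(pearson(x, y), 3))\n",
    "print('spearman:', round(spearman(x, y), 3))"
   ]
  },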
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example\n",
    "\n",
    "In this example, the abundance of a *Prochlorococcus* strain (MIT9313PCR, passed as `sourceVar`) measured by [Chisholm lab](https://chisholmlab.mit.edu/) during the AMT13 cruise (Atlantic Meridional Transect Cruise 13) is colocalized with 7 target variables (defined in `match_params()`):<br />\n",
    "* 'MIT9312PCR_Chisholm', 'MED4PCR_Chisholm', and 'sbact_Chisholm' from the same [source dataset](https://cmap.readthedocs.io/en/latest/catalog/datasets/Chisholm_AMT13.html#chisholm-amt13)\n",
    "* 'phosphate_WOA_clim', and 'nitrate_WOA_clim' from [World Ocean Atlas](https://cmap.readthedocs.io/en/latest/catalog/datasets/WOA_climatology.html#woa-clim) monthly climatology dataset\n",
    "* 'chl' from weekly averaged satellite [chlorophyll dataset](https://cmap.readthedocs.io/en/latest/catalog/datasets/Chlorophyll_REP.html#chlorophyll-rep)\n",
    "* 'picoprokaryote' from 3-day averaged [Darwin model](https://cmap.readthedocs.io/en/latest/catalog/datasets/Darwin_3day.html#darwin-3day). Colocalizing this variable takes longer than the others because the 3-day averaged Darwin dataset is massive (a multi-decade, global, 3D dataset)!\n",
    "\n",
    "<br />**Tip1:**<br /> \n",
    "The space-time cut parameters (`dt1` through `depth2`) have been set so as to encompass the entire source dataset 'tblAMT13_Chisholm' (see the [dataset page](https://cmap.readthedocs.io/en/latest/catalog/datasets/Chisholm_AMT13.html#chisholm-amt13) for more details). Notice that the last data point in the source dataset was measured at '2003-10-12 12:44:00'. For simplicity, dt2 has been set to '2003-10-13', but you could also use the exact date-time '2003-10-12 12:44:00'. \n",
    "\n",
    " \n",
    "<br />Please review **Example 1** on the [Match (colocalize) Datasets](Match.ipynb) page, since all of the tips mentioned there apply directly to this example too."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#!pip install pycmap -q     #uncomment to install pycmap, if necessary\n",
    "# uncomment the lines below if the API key has not previously been registered on your machine.\n",
    "# import pycmap\n",
    "# pycmap.API(token='<YOUR_API_KEY>', vizEngine='plotly')\n",
    "\n",
    "from collections import namedtuple\n",
    "from pycmap.viz import plot_corr_map\n",
    "\n",
    "\n",
    "\n",
    "def match_params():\n",
    "    Param = namedtuple('Param', ['table', 'variable', 'temporalTolerance', 'latTolerance', 'lonTolerance', 'depthTolerance'])\n",
    "    params = []\n",
    "    ######## self-matching: colocalizing with some other variables in the tblAMT13_Chisholm dataset\n",
    "    params.append(Param('tblAMT13_Chisholm', 'MIT9312PCR_Chisholm', 0, 0, 0, 0))\n",
    "    params.append(Param('tblAMT13_Chisholm', 'MED4PCR_Chisholm', 0, 0, 0, 0))\n",
    "    params.append(Param('tblAMT13_Chisholm', 'sbact_Chisholm', 0, 0, 0, 0))\n",
    "    ####### WOA: World Ocean Atlas Monthly Climatology\n",
    "    params.append(Param('tblWOA_Climatology', 'nitrate_WOA_clim', 0, 0.5, 0.5, 5))\n",
    "    params.append(Param('tblWOA_Climatology', 'phosphate_WOA_clim', 0, 0.5, 0.5, 5))\n",
    "    ####### Satellite\n",
    "    params.append(Param('tblCHL_REP', 'chl', 4, 0.25, 0.25, 0))\n",
    "    ####### Darwin Model\n",
    "    params.append(Param('tblDarwin_Phytoplankton', 'picoprokaryote', 2, 0.25, 0.25, 5))\n",
    "\n",
    "    \n",
    "    tables, variables, temporalTolerance, latTolerance, lonTolerance, depthTolerance = [], [], [], [], [], []\n",
    "    for p in params:\n",
    "        tables.append(p.table)\n",
    "        variables.append(p.variable)\n",
    "        temporalTolerance.append(p.temporalTolerance)\n",
    "        latTolerance.append(p.latTolerance)\n",
    "        lonTolerance.append(p.lonTolerance)\n",
    "        depthTolerance.append(p.depthTolerance)\n",
    "    return tables, variables, temporalTolerance, latTolerance, lonTolerance, depthTolerance\n",
    "\n",
    "\n",
    "\n",
    "targetTables, targetVars, temporalTolerance, latTolerance, lonTolerance, depthTolerance = match_params()\n",
    "go = plot_corr_map(\n",
    "                  sourceTable='tblAMT13_Chisholm', \n",
    "                  sourceVar='MIT9313PCR_Chisholm',\n",
    "                  targetTables=targetTables,\n",
    "                  targetVars=targetVars,\n",
    "                  dt1='2003-09-14', \n",
    "                  dt2='2003-10-13', \n",
    "                  lat1=-48, \n",
    "                  lat2=48, \n",
    "                  lon1=-52, \n",
    "                  lon2=-11, \n",
    "                  depth1=0, \n",
    "                  depth2=240, \n",
    "                  temporalTolerance=temporalTolerance,\n",
    "                  latTolerance=latTolerance,\n",
    "                  lonTolerance=lonTolerance,\n",
    "                  depthTolerance=depthTolerance\n",
    "                  )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# here is how to modify the graph:\n",
    "import numpy as np\n",
    "\n",
    "# print correlation values\n",
    "# print(go.z)\n",
    "# print(go.x)\n",
    "# print(go.y)\n",
    "go.z = np.abs(go.z)\n",
    "go.cmap = 'Greys'\n",
    "go.width = 1000\n",
    "go.height = 1000\n",
    "go.render()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}