{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"contentcontainer med left\" style=\"margin-left: -50px;\">\n",
    "<dl class=\"dl-horizontal\">\n",
    "  <dt>Title</dt> <dd> Scatter Element</dd>\n",
    "  <dt>Dependencies</dt> <dd>Bokeh</dd>\n",
    "  <dt>Backends</dt>\n",
    "    <dd><a href='./Scatter.ipynb'>Bokeh</a></dd>\n",
    "    <dd><a href='../matplotlib/Scatter.ipynb'>Matplotlib</a></dd>\n",
    "    <dd><a href='../plotly/Scatter.ipynb'>Plotly</a></dd>\n",
    "</dl>\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "import holoviews as hv\n",
    "from holoviews import dim\n",
    "\n",
    "hv.extension('bokeh')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The ``Scatter`` element visualizes as markers placed in a space of one independent variable, traditionally denoted as *x*, against a dependent variable, traditionally denoted as *y*. In HoloViews, the name ``'x'`` is the default dimension name used in the key dimensions (``kdims``) and ``'y'`` is the default dimension name used in the value dimensions (``vdims``). We can see this from the default axis labels when visualizing a simple ``Scatter`` element:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(42)\n",
    "coords = [(i, np.random.random()) for i in range(20)]\n",
    "scatter = hv.Scatter(coords)\n",
    "\n",
    "scatter.opts(color='k', marker='s', size=10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here the random *y* values are considered to be the 'data' whereas the *x* positions express where those data values were measured (compare this to the different way that [``Points``](./Points.ipynb) elements are defined). In this sense, ``Scatter`` is equivalent to a [``Curve``](./Curve.ipynb) without any lines connecting the samples, and you can use slicing to view the *y* values corresponding to a chosen *x* range:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "scatter[0:12] + scatter[12:20]"
   ]
  },
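  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of the equivalence with ``Curve`` described above, the same ``coords`` list can be passed to both element types and the results overlaid with ``*``. The styling values here are illustrative choices rather than recommended defaults:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Same samples drawn twice: a connecting line (Curve) and markers (Scatter)\n",
    "curve = hv.Curve(coords)\n",
    "markers = hv.Scatter(coords).opts(color='k', size=6)\n",
    "\n",
    "curve * markers"
   ]
  },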
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A ``Scatter`` element must always have at least one value dimension (to give it a *y* location), but additional value dimensions are also supported. Here is an example with two additional quantities for each point, declared as the ``vdims`` ``'z'`` and ``'size'`` visualized as the color and size of the dots, respectively:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(10)\n",
    "data = np.random.rand(100,4)\n",
    "\n",
    "scatter = hv.Scatter(data, vdims=['y', 'z', 'size'])\n",
    "scatter = scatter.opts(color='z', size=dim('size')*20)\n",
    "scatter + scatter[0.3:0.7, 0.3:0.7].hist()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the right subplot, the ``hist`` method is used to show the distribution of samples along our first value dimension, (``'y'``)."
   ]
  },
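  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to check which dimension ``hist`` uses by default is to inspect the element's ``kdims`` and ``vdims``. The sketch below also passes the ``dimension`` argument of ``hist`` to request a histogram of ``'z'`` instead; it reuses the ``scatter`` object from the previous code cell:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Key and value dimensions, in declaration order\n",
    "print(scatter.kdims, scatter.vdims)\n",
    "\n",
    "# Adjoin a histogram of 'z' rather than of the first vdim 'y'\n",
    "scatter.hist(dimension='z')"
   ]
  },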
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The marker shape specified above can be any supported by [matplotlib](https://matplotlib.org/api/markers_api.html), e.g. ``s``, ``d``, or ``o``; the other options select the color and size of the marker.  For convenience with the [bokeh backend](../../../user_guide/Plotting_with_Bokeh.ipynb), the matplotlib marker options are supported using a compatibility function in HoloViews."
   ]
  },
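  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a small sketch of this, the cell below draws the same samples once for each of the three marker codes named above. It relies on the ``np`` and ``hv`` imports from the first code cell, and the labels and marker size are illustrative choices:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Rebuild the simple 20-sample data used in the first example\n",
    "np.random.seed(42)\n",
    "coords = [(i, np.random.random()) for i in range(20)]\n",
    "\n",
    "# One panel per Matplotlib-style marker code, rendered by the Bokeh backend\n",
    "hv.Layout([\n",
    "    hv.Scatter(coords, label='marker ' + m).opts(marker=m, size=10)\n",
    "    for m in ['s', 'd', 'o']\n",
    "])"
   ]
  },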
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note**: Although the  ``Scatter`` element is superficially similar to the [``Points``](./Points.ipynb) element (they can generate plots that look identical), the two element types are semantically quite different: Unlike ``Scatter``, ``Points`` are used to visualize data where the *y* variable is *independent*. This semantic difference also explains why the histogram generated by the ``hist`` call above visualizes the distribution of a different dimension than it does for [``Points``](./Points.ipynb) (because here *y*, not *z*, is the first ``vdim``).\n",
    "\n",
    "This difference means that ``Scatter`` elements can most naturally overlay with other elements that express dependent relationships between the *x* and *y* axes in two-dimensional space, such as the ``Chart`` types like [``Curve``](./Curve.ipynb). Conversely, ``Points`` elements either capture *(x,y)* spatial locations or they express a dependent relationship between an *(x,y)* location and some other dimension (expressed as point size, color, etc.), and thus they can most naturally overlay with [``Raster``](./Raster.ipynb) types like [``Image``](./Image.ipynb).\n",
    "\n",
    "For full documentation and the available style and plot options, use ``hv.help(hv.Scatter).``"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}