{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Working with large data using Datashader\n", "\n", "The various plotting-library backends supported by HoloViews, such as Matplotlib, Bokeh, and Plotly, each have limitations on the amount of data that is practical to work with. Bokeh and Plotly in particular mirror your data directly into an HTML page viewable in your browser, which can cause problems when data sizes approach the limited memory available for each web page in current browsers.\n", "\n", "Luckily, a visualization of even the largest dataset will be constrained by the resolution of your display device, and so one approach to handling such data is to pre-render or rasterize the data into a fixed-size array or image *before* sending it to the backend plotting library and thus to your local web browser. The [Datashader](https://github.com/bokeh/datashader) library provides a high-performance big-data server-side rasterization pipeline that works seamlessly with HoloViews to support datasets that are orders of magnitude larger than those supported natively by the plotting-library backends, including millions or billions of points even on ordinary laptops.\n", "\n", "Here, we will see how and when to use Datashader with HoloViews Elements and Containers. 
To keep the discussion simple, we'll focus on synthetic datasets, but [Datashader's examples](http://datashader.org/topics) include a wide variety of real datasets that give a much better idea of the power of using Datashader with HoloViews, and [HoloViz.org](http://holoviz.org) shows how to install and work with HoloViews and Datashader together.\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import datashader as ds\n", "import numpy as np\n", "import holoviews as hv\n", "\n", "from holoviews import opts\n", "from holoviews.operation.datashader import datashade, shade, dynspread, spread, rasterize\n", "from holoviews.operation import decimate\n", "\n", "hv.extension('bokeh', 'matplotlib')\n", "\n", "# Set class-level defaults for the operations used below\n", "decimate.max_samples = 1000\n", "dynspread.max_px = 20\n", "dynspread.threshold = 0.5\n", "\n", "def random_walk(n, f=5000):\n", "    \"\"\"Random walk in a 2D space, smoothed with a filter of length f\"\"\"\n", "    xs = np.convolve(np.random.normal(0, 0.1, size=n), np.ones(f)/f).cumsum()\n", "    ys = np.convolve(np.random.normal(0, 0.1, size=n), np.ones(f)/f).cumsum()\n", "    xs += 0.1*np.sin(0.1*np.arange(n-1+f))  # add wobble on x axis\n", "    xs += np.random.normal(0, 0.005, size=n-1+f)  # add measurement noise\n", "    ys += np.random.normal(0, 0.005, size=n-1+f)\n", "    return np.column_stack([xs, ys])\n", "\n", "def random_cov():\n", "    \"\"\"Random covariance for use in generating 2D Gaussian distributions\"\"\"\n", "    A = np.random.randn(2, 2)\n", "    return np.dot(A, A.T)\n", "\n", "def time_series(T=1, N=100, mu=0.1, sigma=0.1, S0=20):\n", "    \"\"\"Parameterized noisy time series\"\"\"\n", "    dt = float(T)/N\n", "    t = np.linspace(0, T, N)\n", "    W = np.random.standard_normal(size=N)\n", "    W = np.cumsum(W)*np.sqrt(dt)  # standard Brownian motion\n", "    X = (mu-0.5*sigma**2)*t + sigma*W\n", "    S = S0*np.exp(X)  # geometric Brownian motion\n", "    return S" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "