{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Building Tilesets using Datashader" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ":::{warning}\n", "This notebook is a work in progress, and the tiling functionality may not be fully implemented yet. \n", "If you'd like to contribute or report missing features, feel free to [open an issue](https://github.com/holoviz/datashader/issues) in the Datashader repository.\n", ":::" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Datashader provides `render_tiles` which is a utility function for creating tilesets from arbitrary datashader pipelines." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from datashader.tiles import render_tiles" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A couple of notes about the tiling process:\n", " \n", "- By default, uses a simple `Web Mercator Tiling Scheme (EPSG:3857)`\n", "- call `render_tiles` with the following arguments:\n", "\n", "```python\n", "extent_of_area_i_want_to_tile = (-500000, -500000, 500000, 500000) # xmin, ymin, xmax, ymax\n", "render_tiles(extent_of_data_i_want_to_handle,\n", " tile_levels=range(6),\n", " output_path='example_tileset_output_directory',\n", " load_data_func=function_which_returns_dataframe,\n", " rasterize_func=function_which_creates_xarray_aggregate,\n", " shader_func=function_which_renders_aggregate_to_datashader_image,\n", " post_render_func=function_which_post_processes_image)\n", "```\n", "\n", "- data representing x / y coordinates is assumed to be represented in meters (m) based on the Web Mercator coordinate system.\n", "- the tiling extent is subdivided into `supertiles` generally of size `4096 x 4096`\n", "- the `load_data_func` returns a dataframe-like object and contains your data access specific code.\n", "- the `rasterize_func` returns a `xr.DataArray` and contains your xarray specific code.\n", "- the `shader_func` returns a `ds.Image` and contains your datashader specific code.\n", "- the `post_render_func` is called once for each final tile (`default 256 x 256`) and contains PIL (Python Imaging Library) specific code. This is the hook for adding additional filters, text, watermarks, etc. to output tiles." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating Tile Component Functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create `load_data_func`\n", "- accepts `x_range` and `y_range` arguments which correspond to the ranges of the supertile being rendered.\n", "- returns a dataframe-like object (pd.Dataframe / dask.Dataframe)\n", "- this example `load_data_func` creates a pandas dataframe with `x` and `y` fields sampled from a wald distribution " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "\n", "df = None\n", "def load_data_func(x_range, y_range):\n", " global df\n", " if df is None:\n", " xoffsets = [-1, 1, -1, 1]\n", " yoffsets = [-1, 1, 1, -1]\n", " xs = np.concatenate([np.random.wald(10000000, 10000000, size=10000000) * offset for offset in xoffsets])\n", " ys = np.concatenate([np.random.wald(10000000, 10000000, size=10000000) * offset for offset in yoffsets])\n", " df = pd.DataFrame(dict(x=xs, y=ys))\n", "\n", " return df.loc[df['x'].between(*x_range) & df['y'].between(*y_range)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create `rasterize_func`\n", "- accepts `df`, `x_range`, `y_range`, `height`, `width` arguments which correspond to the data, ranges, and plot dimensions of the supertile being rendered.\n", "- returns an `xr.DataArray` object representing the aggregate." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import datashader as ds\n", "\n", "def rasterize_func(df, x_range, y_range, height, width):\n", " # aggregate\n", " cvs = ds.Canvas(x_range=x_range, y_range=y_range,\n", " plot_height=height, plot_width=width)\n", " agg = cvs.points(df, 'x', 'y')\n", " return agg" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create `shader_func`\n", "- accepts `agg (xr.DataArray)`, `span (tuple(min, max))`. The span argument can be used to control color mapping / auto-ranging across supertiles.\n", "- returns an `ds.Image` object representing the shaded image." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import datashader.transfer_functions as tf\n", "from datashader.colors import viridis\n", "\n", "def shader_func(agg, span=None):\n", " img = tf.shade(agg, cmap=reversed(viridis), span=span, how='log')\n", " img = tf.set_background(img, 'black')\n", " return img" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create `post_render_func`\n", "- accepts `img `, `extras` arguments which correspond to the output PIL.Image before it is write to disk (or S3), and additional image properties.\n", "- returns image `(PIL.Image)`\n", "- this is a good place to run any non-datashader-specific logic on each output tile." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from PIL import ImageDraw\n", "\n", "def post_render_func(img, **kwargs):\n", " info = \"x={},y={},z={}\".format(kwargs['x'], kwargs['y'], kwargs['z'])\n", " draw = ImageDraw.Draw(img)\n", " draw.text((5, 5), info, fill='rgb(255, 255, 255)')\n", " return img" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Render tiles to local filesystem" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "full_extent_of_data = (-500000, -500000, 500000, 500000)\n", "output_path = 'tiles_output_directory/wald_tiles'\n", "results = render_tiles(full_extent_of_data,\n", " range(3),\n", " load_data_func=load_data_func,\n", " rasterize_func=rasterize_func,\n", " shader_func=shader_func,\n", " post_render_func=post_render_func,\n", " output_path=output_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Preview the tileset using Bokeh\n", "- Browse to the tile output directory and start an http server:\n", "\n", "```bash\n", "$> cd test_tiles_output\n", "$> python -m http.server\n", "\n", "Starting up http-server, serving ./\n", "Available on:\n", " http://127.0.0.1:8080\n", " http://192.168.1.7:8080\n", "Hit CTRL-C to stop the server\n", "```\n", "\n", "- build a `bokeh.plotting.Figure`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from bokeh.plotting import figure\n", "from bokeh.models.tiles import WMTSTileSource\n", "from bokeh.io import show\n", "from bokeh.io import output_notebook\n", "\n", "output_notebook()\n", "\n", "xmin, ymin, xmax, ymax = full_extent_of_data\n", "\n", "p = figure(width=800, height=800,\n", " x_range=(int(-20e6), int(20e6)),\n", " y_range=(int(-20e6), int(20e6)),\n", " tools=\"pan,wheel_zoom,reset\")\n", "\n", "p.background_fill_color = 'black'\n", "p.grid.grid_line_alpha = 0\n", "p.axis.visible = False\n", "p.add_tile(WMTSTileSource(url=\"http://localhost:8080/{Z}/{X}/{Y}.png\"),\n", " render_parents=False)\n", "show(p)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Render tiles to Amazon Simple Storage Service (S3)\n", "\n", "To render tiles directly to S3, you only need to use the `s3://` protocol in your `output_path` argument\n", "\n", "- Requires AWS Access / Secret Keys with appropriate IAM permissions for uploading to S3.\n", "- Requires extra `boto3` dependency:\n", "```bash\n", "conda install boto3\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configuring credentials\n", "\n", "- Quoting [`boto3 documentation regarding credential handling`](https://boto3.readthedocs.io/en/latest/guide/configuration.html):\n", "\n", "> The mechanism in which boto3 looks for credentials is to search through a list of possible locations and stop as soon as it finds credentials. The order in which Boto3 searches for credentials is:\n", "1. ~~Passing credentials as parameters in the boto.client() method~~\n", "- ~~Passing credentials as parameters when creating a Session object~~\n", "- **Environment variables**\n", "- **Shared credential file (~/.aws/credentials)**\n", "- **AWS config file (~/.aws/config)**\n", "- **Assume Role provider**\n", "- **Boto2 config file (/etc/boto.cfg and ~/.boto)**\n", "- **Instance metadata service on an Amazon EC2 instance that has an IAM role configured**.\n", "\n", "- Datashader's `render_tiles` function supports only credential search locations highlighted in bold above\n", "- **NOTE**: all tiles written to S3 are marked with `public-read` ACL settings." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Setup tile bucket using AWS CLI\n", "\n", "```bash\n", "$> aws s3 mb s3://datashader-tiles-testing/\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "full_extent_of_data = (int(-20e6), int(-20e6), int(20e6), int(20e6))\n", "output_path = 's3://datashader-tiles-testing/wald_tiles/'\n", "try:\n", " results = render_tiles(full_extent_of_data,\n", " range(3),\n", " load_data_func=load_data_func,\n", " rasterize_func=rasterize_func,\n", " shader_func=shader_func,\n", " post_render_func=post_render_func,\n", " output_path=output_path)\n", "except ImportError:\n", " print('you must install boto3 to save tiles to Amazon S3')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Preview S3 Tiles" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xmin, ymin, xmax, ymax = full_extent_of_data\n", "\n", "p = figure(width=800, height=800,\n", " x_range=(int(-20e6), int(20e6)),\n", " y_range=(int(-20e6), int(20e6)),\n", " tools=\"pan,wheel_zoom,reset\")\n", "p.axis.visible = False\n", "p.background_fill_color = 'black'\n", "p.grid.grid_line_alpha = 0\n", "p.add_tile(WMTSTileSource(url=\"https://datashader-tiles-testing.s3.amazonaws.com/wald_tiles/{Z}/{X}/{Y}.png\"),\n", " render_parents=False)\n", "show(p)" ] } ], "metadata": { "language_info": { "name": "python", "pygments_lexer": "ipython3" } }, "nbformat": 4, "nbformat_minor": 2 }