{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Spatial relationships and operations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "\n",
    "import pandas as pd\n",
    "import geopandas\n",
    "\n",
    "pd.options.display.max_rows = 10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "countries = geopandas.read_file(\"zip://./data/ne_110m_admin_0_countries.zip\")\n",
    "cities = geopandas.read_file(\"zip://./data/ne_110m_populated_places.zip\")\n",
    "rivers = geopandas.read_file(\"zip://./data/ne_50m_rivers_lake_centerlines.zip\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Spatial relationships\n",
    "\n",
    "An important aspect of geospatial data is that we can look at *spatial relationships*: how two spatial objects relate to each other (whether they overlap, intersect, contain, .. one another).\n",
    "\n",
    "The topological, set-theoretic relationships in GIS are typically based on the DE-9IM model. See https://en.wikipedia.org/wiki/Spatial_relation for more information.\n",
    "\n",
    "![](img/TopologicSpatialRelarions2.png)\n",
    "(Image by [Krauss, CC BY-SA 3.0](https://en.wikipedia.org/wiki/Spatial_relation#/media/File:TopologicSpatialRelarions2.png))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Relationships between individual objects"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's first create some small toy spatial objects:\n",
    "\n",
    "A polygon <small>(note: we use `.squeeze()` here to to extract the scalar geometry object from the GeoSeries of length 1)</small>:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "belgium = countries.loc[countries['name'] == 'Belgium', 'geometry'].squeeze()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Two points:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "paris = cities.loc[cities['name'] == 'Paris', 'geometry'].squeeze()\n",
    "brussels = cities.loc[cities['name'] == 'Brussels', 'geometry'].squeeze()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And a linestring:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from shapely.geometry import LineString\n",
    "line = LineString([paris, brussels])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's visualize those 4 geometry objects together (I only put them in a GeoSeries to easily display them together with the geopandas `.plot()` method):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "geopandas.GeoSeries([belgium, paris, brussels, line]).plot(cmap='tab10')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can recognize the abstract shape of Belgium.\n",
    "\n",
    "Brussels, the capital of Belgium, is thus located within Belgium. This is a spatial relationship, and we can test this using the individual shapely geometry objects as follow:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "brussels.within(belgium)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And using the reverse, Belgium contains Brussels:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "belgium.contains(brussels)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On the other hand, Paris is not located in Belgium:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "belgium.contains(paris)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "paris.within(belgium)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The straight line we draw from Paris to Brussels is not fully located within Belgium, but it does intersect with it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "belgium.contains(line)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "line.intersects(belgium)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Spatial relationships with GeoDataFrames\n",
    "\n",
    "The same methods that are available on individual `shapely` geometries as we have seen above, are also available as methods on `GeoSeries` / `GeoDataFrame` objects.\n",
    "\n",
    "For example, if we call the `contains` method on the world dataset with the `paris` point, it will do this spatial check for each country in the `world` dataframe:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "countries.contains(paris)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because the above gives us a boolean result, we can use that to filter the dataframe:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "countries[countries.contains(paris)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And indeed, France is the only country in the world in which Paris is located."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another example, extracting the linestring of the Amazon river in South America, we can query through which countries the river flows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "amazon = rivers[rivers['name'] == 'Amazonas'].geometry.squeeze()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "countries[countries.crosses(amazon)]  # or .intersects"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\" style=\"font-size:120%\">\n",
    "<b>REFERENCE</b>: <br><br>\n",
    "\n",
    "Overview of the different functions to check spatial relationships (*spatial predicate functions*):\n",
    "\n",
    "* `equals`\n",
    "* `contains`\n",
    "* `crosses`\n",
    "* `disjoint`\n",
    "* `intersects`\n",
    "* `overlaps`\n",
    "* `touches`\n",
    "* `within`\n",
    "* `covers`\n",
    "\n",
    "\n",
    "See https://shapely.readthedocs.io/en/stable/manual.html#predicates-and-relationships for an overview of those methods.\n",
    "\n",
    "See https://en.wikipedia.org/wiki/DE-9IM for all details on the semantics of those operations.\n",
    "\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Let's practice!\n",
    "\n",
    "We will again use the Paris datasets to do some exercises. Let's start importing them again:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "districts = geopandas.read_file(\"data/paris_districts_utm.geojson\")\n",
    "stations = geopandas.read_file(\"data/paris_sharing_bike_stations_utm.geojson\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-success\">\n",
    " <b>EXERCISE</b>:\n",
    "\n",
    "\n",
    "* Create a shapely `Point` object for the Notre Dame cathedral (which has x/y coordinates of (452321.4581477511, 5411311.330882619))\n",
    "* Calculate the distance of each bike station to the Notre Dame.\n",
    "* Check in which district the Notre Dame is located.\n",
    " \n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from shapely.geometry import Point"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "clear_cell": true
   },
   "outputs": [],
   "source": [
    "# %load _solved/solutions/02-spatial-relationships-operations1.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "clear_cell": true
   },
   "outputs": [],
   "source": [
    "# %load _solved/solutions/02-spatial-relationships-operations2.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "clear_cell": true
   },
   "outputs": [],
   "source": [
    "# %load _solved/solutions/02-spatial-relationships-operations3.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "clear_cell": true
   },
   "outputs": [],
   "source": [
    "# %load _solved/solutions/02-spatial-relationships-operations4.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Spatial operations\n",
    "\n",
    "Next to the spatial predicates that return boolean values, Shapely and GeoPandas also provide operations that return new geometric objects.\n",
    "\n",
    "**Binary operations:**\n",
    "\n",
    "<table><tr>\n",
    "<td> <img src=\"img/spatial-operations-base.png\"/> </td>\n",
    "<td> <img src=\"img/spatial-operations-intersection.png\"/> </td>\n",
    "</tr>\n",
    "<tr>\n",
    "<td> <img src=\"img/spatial-operations-union.png\"/> </td>\n",
    "<td> <img src=\"img/spatial-operations-difference.png\"/> </td>\n",
    "</tr></table>\n",
    "\n",
    "**Buffer:**\n",
    "\n",
    "<table><tr>\n",
    "<td> <img src=\"img/spatial-operations-buffer-point1.png\"/> </td>\n",
    "<td> <img src=\"img/spatial-operations-buffer-point2.png\"/> </td>\n",
    "</tr>\n",
    "<tr>\n",
    "<td> <img src=\"img/spatial-operations-buffer-line.png\"/> </td>\n",
    "<td> <img src=\"img/spatial-operations-buffer-polygon.png\"/> </td>\n",
    "</tr></table>\n",
    "\n",
    "\n",
    "See https://shapely.readthedocs.io/en/stable/manual.html#spatial-analysis-methods for more details."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For example, using the toy data from above, let's construct a buffer around Brussels (which returns a Polygon):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "geopandas.GeoSeries([belgium, brussels.buffer(1)]).plot(alpha=0.5, cmap='tab10')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "and now take the intersection, union or difference of those two polygons:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "brussels.buffer(1).intersection(belgium)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "brussels.buffer(1).union(belgium)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "brussels.buffer(1).difference(belgium)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another useful method is the `unary_union` attribute, which converts the set of geometry objects in a GeoDataFrame into a single geometry object by taking the union of all those geometries.\n",
    "\n",
    "For example, we can construct a single object for the Africa continent:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "africa_countries = countries[countries['continent'] == 'Africa']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "africa = africa_countries.unary_union"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "africa"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(str(africa)[:1000])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\" style=\"font-size:120%\">\n",
    "<b>REMEMBER</b>: <br><br>\n",
    "\n",
    "GeoPandas (and Shapely for the individual objects) provides a whole lot of basic methods to analyse the geospatial data (distance, length, centroid, boundary, convex_hull, simplify, transform, ....), much more than the few that we can touch in this tutorial.\n",
    "\n",
    "\n",
    "* An overview of all methods provided by GeoPandas can be found here: http://geopandas.readthedocs.io/en/latest/reference.html\n",
    "\n",
    "\n",
    "</div>\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Let's practice!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-success\">\n",
    " <b>EXERCISE: What are the districts close to the Seine?</b>\n",
    " \n",
    " <p>\n",
    " Below, the coordinates for the Seine river in the neighbourhood of Paris are provided as a GeoJSON-like feature dictionary (created at http://geojson.io). \n",
    " </p>\n",
    " \n",
    "  <p>\n",
    " Based on this `seine` object, we want to know which districts are located close (maximum 150 m) to the Seine. \n",
    " </p>\n",
    " \n",
    " \n",
    " <p>\n",
    " <ul>\n",
    "  <li>Create a buffer of 150 m around the Seine.</li>\n",
    "  <li>Check which districts intersect with this buffered object.</li>\n",
    "  <li>Make a visualization of the districts indicating which districts are located close to the Seine.</li>\n",
    " </ul> \n",
    " </p>\n",
    " \n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# created a line with http://geojson.io\n",
    "s_seine = geopandas.GeoDataFrame.from_features({\"type\":\"FeatureCollection\",\"features\":[{\"type\":\"Feature\",\"properties\":{},\"geometry\":{\"type\":\"LineString\",\"coordinates\":[[2.408924102783203,48.805619828930226],[2.4092674255371094,48.81703747481909],[2.3927879333496094,48.82325391133874],[2.360687255859375,48.84912860497674],[2.338714599609375,48.85827758964043],[2.318115234375,48.8641501307046],[2.298717498779297,48.863246707697],[2.2913360595703125,48.859519915404825],[2.2594070434570312,48.8311646245967],[2.2436141967773438,48.82325391133874],[2.236919403076172,48.82347994904826],[2.227306365966797,48.828339513221444],[2.2224998474121094,48.83862215329593],[2.2254180908203125,48.84856379804802],[2.2240447998046875,48.85409863123821],[2.230224609375,48.867989496547864],[2.260265350341797,48.89192242750887],[2.300262451171875,48.910203080780285]]}}]},\n",
    "                                               crs={'init': 'epsg:4326'})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# convert to local UTM zone\n",
    "s_seine_utm = s_seine.to_crs(epsg=32631)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "fig, ax = plt.subplots(figsize=(20, 10))\n",
    "districts.plot(ax=ax, color='grey', alpha=0.4, edgecolor='k')\n",
    "s_seine_utm.plot(ax=ax)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# access the single geometry object\n",
    "seine = s_seine_utm.geometry.squeeze()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "clear_cell": true
   },
   "outputs": [],
   "source": [
    "# %load _solved/solutions/02-spatial-relationships-operations5.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "clear_cell": true
   },
   "outputs": [],
   "source": [
    "# %load _solved/solutions/02-spatial-relationships-operations6.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "clear_cell": true
   },
   "outputs": [],
   "source": [
    "# %load _solved/solutions/02-spatial-relationships-operations7.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "clear_cell": true
   },
   "outputs": [],
   "source": [
    "# %load _solved/solutions/02-spatial-relationships-operations8.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}