{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Spatial relationships and operations" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import pandas as pd\n", "import geopandas\n", "\n", "pd.options.display.max_rows = 10" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "countries = geopandas.read_file(\"zip://./data/ne_110m_admin_0_countries.zip\")\n", "cities = geopandas.read_file(\"zip://./data/ne_110m_populated_places.zip\")\n", "rivers = geopandas.read_file(\"zip://./data/ne_50m_rivers_lake_centerlines.zip\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Spatial relationships\n", "\n", "An important aspect of geospatial data is that we can look at *spatial relationships*: how two spatial objects relate to each other (whether they overlap, intersect, contain, .. one another).\n", "\n", "The topological, set-theoretic relationships in GIS are typically based on the DE-9IM model. 
See https://en.wikipedia.org/wiki/Spatial_relation for more information.\n", "\n", "![](img/TopologicSpatialRelarions2.png)\n", "(Image by [Krauss, CC BY-SA 3.0](https://en.wikipedia.org/wiki/Spatial_relation#/media/File:TopologicSpatialRelarions2.png))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Relationships between individual objects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's first create some small toy spatial objects:\n", "\n", "A polygon (note: we use `.squeeze()` here to extract the scalar geometry object from the GeoSeries of length 1):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "belgium = countries.loc[countries['name'] == 'Belgium', 'geometry'].squeeze()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Two points:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "paris = cities.loc[cities['name'] == 'Paris', 'geometry'].squeeze()\n", "brussels = cities.loc[cities['name'] == 'Brussels', 'geometry'].squeeze()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And a linestring:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from shapely.geometry import LineString\n", "line = LineString([paris, brussels])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's visualize those 4 geometry objects together (I only put them in a GeoSeries to easily display them together with the geopandas `.plot()` method):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "geopandas.GeoSeries([belgium, paris, brussels, line]).plot(cmap='tab10')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can recognize the abstract shape of Belgium.\n", "\n", "Brussels, the capital of Belgium, is thus located within Belgium. 
This is a spatial relationship, and we can test this using the individual shapely geometry objects as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "brussels.within(belgium)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And conversely, Belgium contains Brussels:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "belgium.contains(brussels)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On the other hand, Paris is not located in Belgium:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "belgium.contains(paris)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "paris.within(belgium)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The straight line we drew from Paris to Brussels is not fully located within Belgium, but it does intersect with it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "belgium.contains(line)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "line.intersects(belgium)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Spatial relationships with GeoDataFrames\n", "\n", "The same methods that we have seen above on individual `shapely` geometries are also available as methods on `GeoSeries` / `GeoDataFrame` objects.\n", "\n", "For example, if we call the `contains` method on the `countries` dataset with the `paris` point, it will do this spatial check for each country in the `countries` dataframe:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "countries.contains(paris)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because the above gives us a boolean result, we can use that to filter the dataframe:" ] }, { "cell_type": 
"code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "countries[countries.contains(paris)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And indeed, France is the only country in the world in which Paris is located." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another example, extracting the linestring of the Amazon river in South America, we can query through which countries the river flows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "amazon = rivers[rivers['name'] == 'Amazonas'].geometry.squeeze()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "countries[countries.crosses(amazon)] # or .intersects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " | \n", " |
\n", " | \n", " |
\n", " | \n", " |
\n", " | \n", " |
\n", " Below, the coordinates for the Seine river in the neighbourhood of Paris are provided as a GeoJSON-like feature dictionary (created at http://geojson.io). \n", "
\n", " \n", "\n", " Based on this `seine` object, we want to know which districts are located close (maximum 150 m) to the Seine. \n", "
\n", " \n", " \n", "\n", "