{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "

Coordinate reference systems

" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import pandas as pd\n", "import geopandas" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "countries = geopandas.read_file(\"data/ne_110m_admin_0_countries.zip\")\n", "cities = geopandas.read_file(\"data/ne_110m_populated_places.zip\")\n", "rivers = geopandas.read_file(\"data/ne_50m_rivers_lake_centerlines.zip\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Coordinate reference systems\n", "\n", "Up to now, we have used the geometry data with certain coordinates without further wondering what those coordinates mean or how they are expressed.\n", "\n", "> The **Coordinate Reference System (CRS)** relates the coordinates to a specific location on earth.\n", "\n", "For an in-depth explanation, see https://docs.qgis.org/2.8/en/docs/gentle_gis_introduction/coordinate_reference_systems.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Geographic coordinates\n", "\n", "> Degrees of latitude and longitude.\n", ">\n", "> E.g. 48°51′N, 2°17′E\n", "\n", "The most known type of coordinates are geographic coordinates: we define a position on the globe in degrees of latitude and longitude, relative to the equator and the prime meridian. \n", "With this system, we can easily specify any location on earth. It is used widely, for example in GPS. If you inspect the coordinates of a location in Google Maps, you will also see latitude and longitude.\n", "\n", "**Attention!**\n", "\n", "in Python we use (lon, lat) and not (lat, lon)\n", "\n", "- Longitude: [-180, 180]{{1}}\n", "- Latitude: [-90, 90]{{1}}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Projected coordinates\n", "\n", "> `(x, y)` coordinates are usually in meters or feet\n", "\n", "Although the earth is a globe, in practice we usually represent it on a flat surface: think about a physical map, or the figures we have made with Python on our computer screen.\n", "Going from the globe to a flat map is what we call a *projection*.\n", "\n", "![](img/projection.png)\n", "\n", "We project the surface of the earth onto a 2D plane so we can express locations in cartesian x and y coordinates, on a flat surface. In this plane, we then typically work with a length unit such as meters instead of degrees, which makes the analysis more convenient and effective.\n", "\n", "However, there is an important remark: the 3 dimensional earth can never be represented perfectly on a 2 dimensional map, so projections inevitably introduce distortions. To minimize such errors, there are different approaches to project, each with specific advantages and disadvantages.\n", "\n", "Some projection systems will try to preserve the area size of geometries, such as the Albers Equal Area projection. Other projection systems try to preserve angles, such as the Mercator projection, but will see big distortions in the area. Every projection system will always have some distortion of area, angle or distance.\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Projected size vs actual size (Mercator projection)**:\n", "\n", "![](img/mercator_projection_area.gif)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Coordinate Reference Systems in Python / GeoPandas" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A GeoDataFrame or GeoSeries has a `.crs` attribute which holds (optionally) a description of the coordinate reference system of the geometries:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "countries.crs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the `countries` dataframe, it indicates that it uses the EPSG 4326 / WGS84 lon/lat reference system, which is one of the most used for geographic coordinates.\n", "\n", "\n", "It uses coordinates as latitude and longitude in degrees, as can you be seen from the x/y labels on the plot:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "countries.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `.crs` attribute returns a `pyproj.CRS` object. To specify a CRS, we typically use some string representation:\n", "\n", "\n", "- **EPSG code**\n", " \n", " Example: `EPSG:4326` = WGS84 geographic CRS (longitude, latitude)\n", " \n", "- **Well-Know-Text (WKT)** representation\n", "\n", "- In older software and datasets, you might also encounter a \"`proj4` string\" representation:\n", " \n", " Example: `+proj=longlat +datum=WGS84 +no_defs`\n", "\n", " This is however no longer recommended.\n", "\n", "\n", "See eg https://epsg.io/4326\n", "\n", "Under the hood, GeoPandas uses the `pyproj` / `PROJ` libraries to deal with the re-projections.\n", "\n", "For more information, see also http://geopandas.readthedocs.io/en/latest/projections.html." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Transforming to another CRS\n", "\n", "We can convert a GeoDataFrame to another reference system using the `to_crs` function. \n", "\n", "For example, let's convert the countries to the World Mercator projection (http://epsg.io/3395):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# remove Antartica, as the Mercator projection cannot deal with the poles\n", "countries = countries[(countries['name'] != \"Antarctica\")]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "countries_mercator = countries.to_crs(epsg=3395) # or .to_crs(\"EPSG:3395\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "countries_mercator.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note the different scale of x and y." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why using a different CRS?\n", "\n", "There are sometimes good reasons you want to change the coordinate references system of your dataset, for example:\n", "\n", "- Different sources with different CRS -> need to convert to the same crs\n", "\n", " ```python\n", " df1 = geopandas.read_file(...)\n", " df2 = geopandas.read_file(...)\n", "\n", " df2 = df2.to_crs(df1.crs)\n", " ```\n", "\n", "- Mapping (distortion of shape and distances)\n", "\n", "- Distance / area based calculations -> ensure you use an appropriate projected coordinate system expressed in a meaningful unit such as meters or feet (not degrees).\n", "\n", "
\n", "\n", "**ATTENTION:**\n", "\n", "All the calculations that happen in GeoPandas and Shapely assume that your data is in a 2D cartesian plane, and thus the result of those calculations will only be correct if your data is properly projected.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Let's practice!\n", "\n", "Again, we will go back to the Paris datasets. Up to now, we provided the datasets in an appropriate projected CRS for the exercises. But the original data were actually using geographic coordinates. In the following exercises, we will start from there.\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Going back to the Paris districts dataset, this is now provided as a GeoJSON file (`\"data/paris_districts.geojson\"`) in geographic coordinates.\n", "\n", "For converting to projected coordinates, we will use the standard projected CRS for France is the RGF93 / Lambert-93 reference system, referenced by the `EPSG:2154` number (in Belgium this would be Lambert 72, EPSG:31370).\n", "\n", "
\n", "\n", "**EXERCISE 1: Projecting a GeoDataFrame**\n", "\n", "* Read the districts datasets (`\"data/paris_districts.geojson\"`) into a GeoDataFrame called `districts`.\n", "* Look at the CRS attribute of the GeoDataFrame. Do you recognize the EPSG number?\n", "* Make a plot of the `districts` dataset.\n", "* Calculate the area of all districts.\n", "* Convert the `districts` to a projected CRS (using the `EPSG:2154` for France). Call the new dataset `districts_RGF93`.\n", "* Make a similar plot of `districts_RGF93`.\n", "* Calculate the area of all districts again with `districts_RGF93` (the result will now be expressed in m²).\n", " \n", " \n", "
Hints\n", "\n", "* The CRS information is stored in the `.crs` attribute of a GeoDataFrame.\n", "* Making a simple plot of a GeoDataFrame can be done with the `.plot()` method.\n", "* Converting to a different CRS can be done with the `.to_crs()` method, and the CRS can be specified as an EPSG number using the `epsg` keyword.\n", "\n", "
\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "nbtutor-solution" ] }, "outputs": [], "source": [ "# %load _solved/solutions/02-coordinate-reference-systems1.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false }, "tags": [ "nbtutor-solution" ] }, "outputs": [], "source": [ "# %load _solved/solutions/02-coordinate-reference-systems2.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false }, "tags": [ "nbtutor-solution" ] }, "outputs": [], "source": [ "# %load _solved/solutions/02-coordinate-reference-systems3.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false }, "tags": [ "nbtutor-solution" ] }, "outputs": [], "source": [ "# %load _solved/solutions/02-coordinate-reference-systems4.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false }, "tags": [ "nbtutor-solution" ] }, "outputs": [], "source": [ "# %load _solved/solutions/02-coordinate-reference-systems5.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "nbtutor-solution" ] }, "outputs": [], "source": [ "# %load _solved/solutions/02-coordinate-reference-systems6.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false }, "tags": [ "nbtutor-solution" ] }, "outputs": [], "source": [ "# %load _solved/solutions/02-coordinate-reference-systems7.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false }, "tags": [ "nbtutor-solution" ] }, "outputs": [], "source": [ "# %load _solved/solutions/02-coordinate-reference-systems8.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "**EXERCISE 2:**\n", "\n", "In the previous notebook, we did an exercise on plotting the bike stations locations in Paris and adding a background map to it using the `contextily` package.\n", "\n", "Currently, `contextily` assumes that your data is in the Web Mercator projection, the system used by most web tile services. And in that first exercise, we provided the data in the appropriate CRS so you didn't need to care about this aspect.\n", "\n", "However, typically, your data will not come in Web Mercator (`EPSG:3857`) and you will have to align them with web tiles on your own.\n", " \n", "* Read the bike stations datasets (`\"data/paris_bike_stations.geojson\"`) into a GeoDataFrame called `stations`.\n", "* Convert the `stations` dataset to the Web Mercator projection (`EPSG:3857`). Call the result `stations_webmercator`, and inspect the result.\n", "* Make a plot of this projected dataset (specify the marker size to be 5) and add a background map using `contextily`.\n", "\n", " \n", "
Hints\n", "\n", "* Making a simple plot of a GeoDataFrame can be done with the `.plot()` method. This returns a matplotlib axes object.\n", "* The marker size can be specified with the `markersize` keyword if the `.plot()` method.\n", "* To add a background map, use the `contextily.add_basemap()` function. It takes the matplotlib `ax` to which to add a map as the first argument.\n", "\n", "
\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false }, "tags": [ "nbtutor-solution" ] }, "outputs": [], "source": [ "# %load _solved/solutions/02-coordinate-reference-systems9.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false }, "tags": [ "nbtutor-solution" ] }, "outputs": [], "source": [ "# %load _solved/solutions/02-coordinate-reference-systems10.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": false }, "tags": [ "nbtutor-solution" ] }, "outputs": [], "source": [ "# %load _solved/solutions/02-coordinate-reference-systems11.py" ] } ], "metadata": { "celltoolbar": "Nbtutor - export exercises", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }