{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Reading raster files with Rasterio\n", "\n", "[Rasterio](https://mapbox.github.io/rasterio/) is a highly useful module for raster processing which you can use for reading and writing [several different raster formats](http://www.gdal.org/formats_list.html) in Python. Rasterio is based on [GDAL](http://www.gdal.org/) and Python automatically registers all known GDAL drivers for reading supported\n", "formats when importing the module. Most common file formats include for example [TIFF and GeoTIFF](http://www.gdal.org/frmt_gtiff.html),\n", "[ASCII Grid](http://www.gdal.org/frmt_various.html#AAIGrid) and [Erdas Imagine .img](http://www.gdal.org/frmt_hfa.html) -files.\n", "\n", "[Landsat 8](http://landsat.gsfc.nasa.gov/landsat-8/landsat-8-bands) bands are stored as separate GeoTIFF -files in the original package. Each band contains information of surface reflectance from different ranges of the electromagnetic spectrum.\n", "\n", "### Download data\n", "\n", "**Download the data** package from [here](http://www.helsinki.fi/science/accessibility/opetus/autogis/L5_data.zip). The package contains various TIF-files that will be explored during the tutorials. \n", "\n", "Let's start with inspecting one of the files we downloaded:\n" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "rasterio.io.DatasetReader" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import rasterio\n", "import os\n", "import numpy as np\n", "%matplotlib inline\n", "\n", "# Data dir\n", "data_dir = \"L5_data\"\n", "fp = os.path.join(data_dir, \"Helsinki_masked_p188r018_7t20020529_z34__LV-FIN.tif\")\n", "\n", "# Open the file:\n", "raster = rasterio.open(fp)\n", "\n", "# Check type of the variable 'raster'\n", "type(raster)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okey so from here we can see that our `raster` variable is a `rasterio._io.RasterReader` type which means that we have opened the file for reading." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Read raster file properties\n", "\n", "Let's have a closer look at the properties of the file:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "CRS({'proj': 'tmerc', 'lat_0': 0, 'lon_0': -183, 'k': 0.9996, 'x_0': 500000, 'y_0': 0, 'datum': 'WGS84', 'units': 'm', 'no_defs': True})" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Projection\n", "raster.crs" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Affine(28.5, 0.0, 698592.0,\n", " 0.0, -28.5, 6697870.5)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Affine transform (how raster is scaled, rotated, skewed, and/or translated)\n", "raster.transform\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1288\n", "1439\n" ] } ], "source": [ "# Dimensions\n", "print(raster.width)\n", "print(raster.height)\n", "\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Number of bands\n", "raster.count\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "BoundingBox(left=698592.0, bottom=6656859.0, right=735300.0, top=6697870.5)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Bounds of the file\n", "raster.bounds" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'GTiff'" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Driver (data format)\n", "raster.driver" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(None, None, None, None, None, None, None)" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# No data values for all channels\n", "raster.nodatavals" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'driver': 'GTiff',\n", " 'dtype': 'uint8',\n", " 'nodata': None,\n", " 'width': 1288,\n", " 'height': 1439,\n", " 'count': 7,\n", " 'crs': CRS({'proj': 'tmerc', 'lat_0': 0, 'lon_0': -183, 'k': 0.9996, 'x_0': 500000, 'y_0': 0, 'datum': 'WGS84', 'units': 'm', 'no_defs': True}),\n", " 'transform': Affine(28.5, 0.0, 698592.0,\n", " 0.0, -28.5, 6697870.5)}" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# All Metadata for the whole raster dataset\n", "raster.meta" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get raster bands\n", "\n", "Different bands of a satellite images are often stacked together in one raster dataset. In our case, all seven bands of the Landsat 8 scene are included in our GeoTIFF and the `count` is hence 7.\n", "\n", "In order to have a closer look at the values stored in the band, we will take advantage of the [GDAL Band API](http://gdal.org/python/osgeo.gdal.Band-class.html).\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "uint8\n" ] } ], "source": [ "# Read the raster band as separate variable\n", "band1 = raster.read(1)\n", "\n", "# Check type of the variable 'band'\n", "print(type(band1))\n", "\n", "# Data type of the values\n", "print(band1.dtype)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have the values of the raster band stored in the variable `band1`.\n", "\n", "Data type of the band can be interpreted with the help of GDAL documentation on [Pixel data types](http://www.gdal.org/gdal_8h.html#a22e22ce0a55036a96f652765793fb7a4). Unsigned integer is always equal or greater than zero and signed integer can store also negative values. For example, an unsigned 16-bit integer can store 2^16 (=65,536) values ranging from 0 to 65,535.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Band statistics\n", "\n", "Next, let's have a look at the values that are stored in the band. As the values of the bands are stored as numpy arrays, it is extremely easy to calculate basic statistics by using basic numpy functions." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'min': 0, 'mean': 59.63132232528628, 'median': 61.0, 'max': 255},\n", " {'min': 0, 'mean': 43.13342814842951, 'median': 43.0, 'max': 255},\n", " {'min': 0, 'mean': 36.29418775547201, 'median': 31.0, 'max': 255},\n", " {'min': 0, 'mean': 35.0946303937776, 'median': 13.0, 'max': 255},\n", " {'min': 0, 'mean': 37.63263502518571, 'median': 13.0, 'max': 255},\n", " {'min': 0, 'mean': 105.8221477777442, 'median': 114.0, 'max': 175},\n", " {'min': 0, 'mean': 26.28348760569581, 'median': 14.0, 'max': 255}]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Read all bands\n", "array = raster.read()\n", "\n", "# Calculate statistics for each band\n", "stats = []\n", "for band in array:\n", " stats.append({\n", " 'min': band.min(),\n", " 'mean': band.mean(),\n", " 'median': np.median(band),\n", " 'max': band.max()})\n", "\n", "# Show stats for each channel\n", "stats" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.0" } }, "nbformat": 4, "nbformat_minor": 2 }