{ "cells": [ { "cell_type": "markdown", "id": "f2ccdf6a", "metadata": {}, "source": [ "## Reprojecting and Resampling\n", "\n", "In this tutorial, you'll learn how to reproject [Cloud Optimized GeoTIFFs](https://www.cogeo.org/) (COG) into a different [Coordinate Reference System](https://gdal.org/tutorials/osr_api_tut.html) (CRS). Reprojecting data usually also includes resampling the raster data to a different resolution, which requires combining multiple pixel values into (when downsampling -- or resampling to lower resolution/larger cellsizes) or interpolating new pixels from the existing ones (when upsampling -- or resampling to higher resolution/smaller cellsizes). You will also learn how to use different resampling algorithms to adjust the raster pixel size.\n", "\n", "This tutorial covers several methods for reprojecting. Which one to pick depends primarily on two factors: \n", "\n", "* whether you already know what CRS you are targeting when you load your data\n", "* whether you have [STAC Items](https://github.com/radiantearth/stac-spec/blob/master/overview.md#item-overview) for the data you're working with\n", "\n", "In general, reprojecting and resampling while loading the data is more efficient, and all the remote-sensing datasets in the [Planetary Computer's data catalog](https://planetarycomputer.microsoft.com/catalog) include STAC Items.\n", "\n", "This tutorial covers the following four methods for reprojecting and resampling geodata:\n", "\n", "* [Method 1](#Method-1:-Reproject-and-resample-while-loading-with-stackstac.stack): **Reproject and resample a large number of COGs *while loading*** using [stackstac](https://stackstac.readthedocs.io/en/latest/). This method requires STAC items.\n", "* [Method 2](#Method-2:-Reproject-and-resample-while-loading-with-rasterio-and-rioxarray-using-a-WarpedVRT): **Reproject and resample a single COG *while loading*** with [rioxarray](https://corteva.github.io/rioxarray/stable/) and [rasterio](https://rasterio.readthedocs.io/en/latest/index.html) using a [WarpedVRT](https://rasterio.readthedocs.io/en/latest/topics/virtual-warping.html). This option works without STAC items but won't scale to a large number of COGs.\n", "* [Method 3](#Method-3:-Reproject-and-resample-an-in-memory-array-with-stackstac.reproject_array): **Reproject and resample an *in-memory* (potentially distributed) DataArray** with [`stackstac.reproject_array`](https://stackstac.readthedocs.io/en/latest/api/main/stackstac.reproject_array.html). This option requires more memory and is generally less efficient than option 1, but you don't need to know the destination CRS when loading your data.\n", "* [Method 4](#Method-4:-Reproject-and-resample-an-in-memory-DataArray-with-rasterio-and-rioxarray): **Reproject and resample an *in-memory* (single machine) DataArray** using [rioxarray](https://corteva.github.io/rioxarray/stable/) and [rasterio](https://rasterio.readthedocs.io/en/latest/index.html). This option only works on a single machine and requires more memory than option 2, but you don't need to know the destination CRS when loading your data.\n", "\n", "These reprojection methods are ranked in order of what is generally recommended. However, certain methods may not be available depending on whether you know your destination CRS while loading the data and whether you have STAC Items." ] }, { "cell_type": "code", "execution_count": 1, "id": "e1d8423b", "metadata": {}, "outputs": [], "source": [ "import planetary_computer\n", "import pystac_client\n", "\n", "import numpy as np\n", "import xarray as xr\n", "\n", "import stackstac\n", "import affine\n", "import pyproj\n", "import rioxarray\n", "import rasterio\n", "\n", "from rioxarray.rioxarray import _make_coords\n", "from rasterio.vrt import WarpedVRT\n", "\n", "import xrspatial.multispectral as ms\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "199e7425", "metadata": {}, "source": [ "### Preparation: Create a local Dask cluster\n", "\n", "In this tutorial, you'll be using a small dataset. Create a local Dask cluster to process the data in parallel using all the cores of your machine." ] }, { "cell_type": "code", "execution_count": 2, "id": "e9b221c9-89c7-4c69-b008-43ea0eea1241", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2022-08-16 16:36:43,441 - distributed.diskutils - INFO - Found stale lock file and directory '/tmp/dask-worker-space/worker-iajxxwmr', purging\n", "2022-08-16 16:36:43,442 - distributed.diskutils - INFO - Found stale lock file and directory '/tmp/dask-worker-space/worker-0_6rl8kn', purging\n", "2022-08-16 16:36:43,442 - distributed.diskutils - INFO - Found stale lock file and directory '/tmp/dask-worker-space/worker-rml0ni71', purging\n", "2022-08-16 16:36:43,442 - distributed.diskutils - INFO - Found stale lock file and directory '/tmp/dask-worker-space/worker-9lcurr39', purging\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "/proxy/8787/status\n" ] } ], "source": [ "from dask.distributed import Client\n", "\n", "client = Client()\n", "print(f\"/proxy/{client.scheduler_info()['services']['dashboard']}/status\")" ] }, { "cell_type": "markdown", "id": "9659b7b2", "metadata": {}, "source": [ "To follow the progress of your computation, you can [access the Dask Dashboard](https://planetarycomputer.microsoft.com/docs/quickstarts/scale-with-dask/#Open-the-dashboard) at the URL from the previous cell's output.\n", "\n", "The region of interest for this tutorial is located in Colorado, USA, centered by Cochetopa Dome and Sawtooth Mountain. You'll be using [Sentinel-2 Level-2A](https://planetarycomputer.microsoft.com/dataset/sentinel-2-l2a) data. See [Reading data from the STAC API](https://planetarycomputer.microsoft.com/docs/quickstarts/reading-stac/) for more information on accessing data through the STAC API." ] }, { "cell_type": "code", "execution_count": 3, "id": "47dc1478-3220-4b28-9418-225f21ea03b3", "metadata": {}, "outputs": [], "source": [ "stac = pystac_client.Client.open(\n", " \"https://planetarycomputer.microsoft.com/api/stac/v1/\",\n", " modifier=planetary_computer.sign_inplace,\n", ")\n", "search = stac.search(\n", " collections=[\"sentinel-2-l2a\"],\n", " intersects={\"type\": \"Point\", \"coordinates\": [-106.714900, 38.22679]},\n", " query={\"eo:cloud_cover\": {\"lt\": 1}},\n", " datetime=\"2019-09-01/2019-09-30\",\n", ")\n", "\n", "item = next(search.items())" ] }, { "cell_type": "markdown", "id": "98739cec", "metadata": {}, "source": [ "### Method 1: Reproject and resample while loading with [stackstac.stack](https://stackstac.readthedocs.io/en/latest/api/main/stackstac.stack.html)\n", "\n", "The selected scene is stored in an `EPSG:32613` coordinate system. Let's reproject it to a new CRS by providing the destination `epsg` number to the [stackstac.stack](https://stackstac.readthedocs.io/en/latest/api/main/stackstac.stack.html) function. To adjust the data's resolution, also provide values to the function's optional `resolution` and `resampling` parameters.\n", "\n", "With `stackstac.stack`, you can use any CRS as defined by an [EPSG code](http://epsg.io/). In this example, you use the `stackstac.stack` function to reproject to the Lambert Cylindrical projection (`epsg:6933`). [Lambert Cylindrical](https://en.wikipedia.org/wiki/Lambert_cylindrical_equal-area_projection) is an equal-area projection that preserves area measurements. Regions with the same size on the Earth's surface have the same size on the map. However, shapes, angles, and scales are generally distorted.\n", "\n", "This example uses the [bilinear](https://rasterio.readthedocs.io/en/latest/api/rasterio.enums.html#rasterio.enums.Resampling.bilinear) resampling method. Resampling happens in the `stackstac.stack` function as well. As this function uses [rasterio](https://rasterio.readthedocs.io/en/latest/index.html) internally, you can use any algorithm from the [rasterio.enums.Resampling](https://rasterio.readthedocs.io/en/latest/api/rasterio.enums.html#rasterio.enums.Resampling) class. To use a specific algorithm, pass it as a value for the `resampling` parameter. The default resampling method in `stackstac.stack` is `rasterio.enums.Resampling.nearest`." ] }, { "cell_type": "code", "execution_count": 4, "id": "56318085", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray 'stackstac-22268e1e94fdf2bbacd6eb69e3399611' (band: 3,\n", " y: 1013, x: 1235)>\n", "dask.array<getitem, shape=(3, 1013, 1235), dtype=float64, chunksize=(1, 1013, 1235), chunktype=numpy.ndarray>\n", "Coordinates: (12/46)\n", " time datetime64[ns] 2019-09-24T17:50:...\n", " id <U54 'S2B_MSIL2A_20190924T175049...\n", " * band (band) <U3 'B04' 'B03' 'B02'\n", " * x (x) float64 -1.035e+07 ... -1.02...\n", " * y (y) float64 4.593e+06 ... 4.491e+06\n", " s2:saturated_defective_pixel_percentage float64 0.0\n", " ... ...\n", " proj:transform object {0.0, 300000.0, 10.0, 430...\n", " proj:shape object {10980}\n", " common_name (band) <U5 'red' 'green' 'blue'\n", " center_wavelength (band) float64 0.665 0.56 0.49\n", " full_width_half_max (band) float64 0.038 0.045 0.098\n", " epsg int64 6933\n", "Attributes:\n", " spec: RasterSpec(epsg=6933, bounds=(-10353400, 4491300, -10229900,...\n", " crs: epsg:6933\n", " transform: | 100.00, 0.00,-10353400.00|\\n| 0.00,-100.00, 4592600.00|\\n|...\n", " resolution: 100
<xarray.DataArray 'stackstac-22268e1e94fdf2bbacd6eb69e3399611' (band: 3,\n", " y: 1505, x: 1505)>\n", "dask.array<dask_aware_interpnd, shape=(3, 1505, 1505), dtype=float64, chunksize=(1, 1505, 1505), chunktype=numpy.ndarray>\n", "Coordinates: (12/48)\n", " time datetime64[ns] 2019-09-24T17:50:...\n", " id <U54 'S2B_MSIL2A_20190924T175049...\n", " * band (band) <U3 'B04' 'B03' 'B02'\n", " s2:saturated_defective_pixel_percentage float64 0.0\n", " s2:degraded_msi_data_percentage float64 0.0\n", " s2:water_percentage float64 0.2779\n", " ... ...\n", " full_width_half_max (band) float64 0.038 0.045 0.098\n", " epsg int64 6933\n", " x_6933 (y, x) float64 -1.025e+07 ... -1...\n", " y_6933 (y, x) float64 4.629e+06 ... 4.4...\n", " * y (y) float64 9.908e+06 ... 9.757e+06\n", " * x (x) float64 1.067e+07 ... 1.082e+07\n", "Attributes:\n", " spec: RasterSpec(epsg=6933, bounds=(-10353400, 4491300, -10229900,...\n", " crs: epsg:6933\n", " transform: | 100.00, 0.00,-10353400.00|\\n| 0.00,-100.00, 4592600.00|\\n|...\n", " resolution: 100