{ "cells": [ { "cell_type": "markdown", "id": "auburn-marketing", "metadata": {}, "source": [ "# Topic 2: Cloud Optimized Data" ] }, { "cell_type": "markdown", "id": "imperial-sequence", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "id": "earned-bridge", "metadata": {}, "source": [ "In this example we will access the NASA HLS assets, which are archived in cloud optimized geoTIFF (COG) format in the LP DAAC Cumulus cloud space. The COGs can be used like any other geoTIFF file, but have some added features that make them more efficient within the cloud data access paradigm. These features include: overviews and internal tiling. Below we will demonstrate how to leverage these features." ] }, { "cell_type": "code", "execution_count": null, "id": "careful-guatemala", "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import os\n", "import requests\n", "import boto3\n", "import rasterio as rio # https://rasterio.readthedocs.io/en/latest/\n", "from rasterio.plot import show\n", "from rasterio.session import AWSSession\n", "import rioxarray # https://corteva.github.io/rioxarray/stable/index.html\n", "import pandas\n", "import geopandas\n", "import pyproj\n", "from pyproj import Proj\n", "from shapely.ops import transform\n", "import geoviews as gv\n", "from cartopy import crs\n", "import hvplot.xarray\n", "import hvplot.pandas\n", "import holoviews as hv\n", "gv.extension('bokeh', 'matplotlib')" ] }, { "cell_type": "markdown", "id": "hungry-emperor", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "id": "functional-sweet", "metadata": {}, "source": [ "## Set GDAL Configuration Options" ] }, { "cell_type": "markdown", "id": "ranging-publication", "metadata": {}, "source": [ "**First, let us set the gdal configuration options for this session**" ] }, { "cell_type": "code", "execution_count": null, "id": "smart-recovery", "metadata": {}, "outputs": [], "source": [ "def get_temp_creds():\n", " temp_creds_url = 'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials'\n", " return requests.get(temp_creds_url).json()" ] }, { "cell_type": "code", "execution_count": null, "id": "horizontal-detector", "metadata": {}, "outputs": [], "source": [ "temp_creds_req = get_temp_creds()\n", "#temp_creds_req" ] }, { "cell_type": "code", "execution_count": null, "id": "patient-darkness", "metadata": {}, "outputs": [], "source": [ "session = boto3.Session(aws_access_key_id=temp_creds_req['accessKeyId'], \n", " aws_secret_access_key=temp_creds_req['secretAccessKey'],\n", " aws_session_token=temp_creds_req['sessionToken'],\n", " region_name='us-west-2')" ] }, { "cell_type": "code", "execution_count": null, "id": "figured-insight", "metadata": {}, "outputs": [], "source": [ "rio_env = rio.Env(AWSSession(session),\n", " GDAL_DISABLE_READDIR_ON_OPEN='TRUE',\n", " CPL_VSIL_CURL_ALLOWED_EXTENSIONS='tif',\n", " VSI_CACHE=True,\n", " region_name='us-west-2',\n", " GDAL_HTTP_COOKIEFILE=os.path.expanduser('~/cookies.txt'),\n", " GDAL_HTTP_COOKIEJAR=os.path.expanduser('~/cookies.txt'))\n", "rio_env.__enter__()" ] }, { "cell_type": "markdown", "id": "billion-editing", "metadata": {}, "source": [ "## Working with Overviews " ] }, { "cell_type": "markdown", "id": "vietnamese-thriller", "metadata": {}, "source": [ "**Access a single HLS asset to identify the overview levels**" ] }, { "cell_type": "code", "execution_count": null, "id": "communist-teacher", "metadata": {}, "outputs": [], "source": [ "foa_url = \"https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/HLSS30.020/HLS.S30.T13TGF.2020274T174141.v2.0/HLS.S30.T13TGF.2020274T174141.v2.0.B04.tif\"" ] }, { "cell_type": "code", "execution_count": null, "id": "decreased-netscape", "metadata": {}, "outputs": [], "source": [ "with rio.open(foa_url) as src:\n", " hls_ov_levels = src.overviews(1)\n", " \n", "hls_ov_levels" ] }, { "cell_type": "markdown", "id": "raising-journey", "metadata": {}, "source": [ "**Request the second overview level from the asset (`overview_level=1`)**" ] }, { "cell_type": "code", "execution_count": null, "id": "boxed-cologne", "metadata": {}, "outputs": [], "source": [ "%%time\n", "\n", "with rioxarray.open_rasterio(foa_url, masked=True, overview_level=1, chunks=True) as src: # https://nbviewer.jupyter.org/gist/rsignell-usgs/f4dd62ad1274c5b5ed69e5a6b81c1295 & http://rasterio.readthedocs.io/en/latest/topics/resampling.html\n", " print(src)\n", " \n", "src.hvplot.image(x='x', y='y', width=800, height=600)" ] }, { "cell_type": "markdown", "id": "dietary-joining", "metadata": {}, "source": [ "**Request the last overview level (`overview_level=3`)**" ] }, { "cell_type": "code", "execution_count": null, "id": "physical-balance", "metadata": {}, "outputs": [], "source": [ "%%time\n", "\n", "with rioxarray.open_rasterio(foa_url, masked=True, overview_level=3, chunks=True) as src:\n", " print(src)\n", "\n", "src.hvplot.image(x = 'x', y = 'y', width=800, height=600)" ] }, { "cell_type": "markdown", "id": "solid-kenya", "metadata": {}, "source": [ "## Requesting Spatial Subsets" ] }, { "cell_type": "markdown", "id": "resident-devon", "metadata": {}, "source": [ "![COG tiling example](../img/COG_Smile_AOI.png)" ] }, { "cell_type": "markdown", "id": "technical-armor", "metadata": {}, "source": [ "**Now, we will read in a geojson file and use its bounding box to clip the cloud asset in later steps**" ] }, { "cell_type": "code", "execution_count": null, "id": "bottom-tomorrow", "metadata": {}, "outputs": [], "source": [ "os.listdir(\"../data/\")" ] }, { "cell_type": "code", "execution_count": null, "id": "affiliated-michael", "metadata": {}, "outputs": [], "source": [ "field = geopandas.read_file('../data/ne_w_agfields.geojson')" ] }, { "cell_type": "code", "execution_count": null, "id": "multiple-virtue", "metadata": {}, "outputs": [], "source": [ "type(field)" ] }, { "cell_type": "code", "execution_count": null, "id": "better-trainer", "metadata": {}, "outputs": [], "source": [ "field" ] }, { "cell_type": "code", "execution_count": null, "id": "still-cause", "metadata": {}, "outputs": [], "source": [ "fieldShape = field['geometry'][0] # Define the geometry as a shapely polygon\n", "\n", "fieldShape" ] }, { "cell_type": "markdown", "id": "palestinian-offer", "metadata": {}, "source": [ "Get the lower-left and upper-right coordinates" ] }, { "cell_type": "code", "execution_count": null, "id": "reliable-garbage", "metadata": {}, "outputs": [], "source": [ "fieldShape.bounds" ] }, { "cell_type": "markdown", "id": "intermediate-programming", "metadata": {}, "source": [ "Use geoviews to combine a basemap with the shapely polygon of our Region of Interest (ROI)" ] }, { "cell_type": "code", "execution_count": null, "id": "equipped-netherlands", "metadata": {}, "outputs": [], "source": [ "base = gv.tile_sources.EsriImagery.opts(width=650, height=500)\n", "farmField = gv.Polygons(fieldShape).opts(line_color='yellow', line_width=10, color=None)\n", "\n", "base * farmField" ] }, { "cell_type": "markdown", "id": "rubber-cigarette", "metadata": {}, "source": [ "### Requests an area of interest" ] }, { "cell_type": "markdown", "id": "downtown-failure", "metadata": {}, "source": [ "**Transform coordinates from lat lon (units = dd) to UTM (units = m)**" ] }, { "cell_type": "code", "execution_count": null, "id": "incident-angel", "metadata": {}, "outputs": [], "source": [ "with rio.open(foa_url) as src:\n", " hls_proj = src.crs.to_string()\n", " \n", "hls_proj" ] }, { "cell_type": "code", "execution_count": null, "id": "final-occupation", "metadata": {}, "outputs": [], "source": [ "geo_CRS = Proj('+proj=longlat +datum=WGS84 +no_defs', preserve_units=True) # Source coordinate system of the ROI\n", "project = pyproj.Transformer.from_proj(geo_CRS, hls_proj) # Set up the transformation\n", "fsUTM = transform(project.transform, fieldShape) # Transform" ] }, { "cell_type": "markdown", "id": "reflected-definition", "metadata": {}, "source": [ "**Print the transformed bounds (now in meters)**" ] }, { "cell_type": "code", "execution_count": null, "id": "reliable-correction", "metadata": {}, "outputs": [], "source": [ "fsUTM.bounds" ] }, { "cell_type": "markdown", "id": "indoor-participant", "metadata": {}, "source": [ "**Use fsUTM to subset the source HLS tile**" ] }, { "cell_type": "markdown", "id": "numeric-caribbean", "metadata": {}, "source": [ "**Requests data at full extent**" ] }, { "cell_type": "code", "execution_count": null, "id": "widespread-brave", "metadata": {}, "outputs": [], "source": [ "%%time\n", "\n", "rds = rioxarray.open_rasterio(foa_url, masked=True, chunks=True)\n", "rds" ] }, { "cell_type": "markdown", "id": "higher-cyprus", "metadata": {}, "source": [ "**Print the spatial reference attribute**" ] }, { "cell_type": "code", "execution_count": null, "id": "possible-coordination", "metadata": {}, "outputs": [], "source": [ "#rds.spatial_ref\n", "rds.spatial_ref.attrs" ] }, { "cell_type": "markdown", "id": "governmental-detective", "metadata": {}, "source": [ "**Plot data at full extent**" ] }, { "cell_type": "code", "execution_count": null, "id": "governing-daily", "metadata": {}, "outputs": [], "source": [ "rds[0].hvplot.image(x = 'x', y = 'y', crs = hls_proj, rasterize=True, cmap='Reds', width=800, height=600, colorbar=True, tiles = 'ESRI')" ] }, { "cell_type": "markdown", "id": "color-pantyhose", "metadata": {}, "source": [ "**Request data that intersects with the input geoJSON boundary only**" ] }, { "cell_type": "code", "execution_count": null, "id": "neural-attention", "metadata": {}, "outputs": [], "source": [ "%time\n", "\n", "rds_clipped = rioxarray.open_rasterio(foa_url, masked=True).rio.clip([fsUTM]) # Note: fsUTM must be in a list\n", "rds_clipped" ] }, { "cell_type": "code", "execution_count": null, "id": "frank-venue", "metadata": {}, "outputs": [], "source": [ "rds_clipped[0].hvplot.image(x = 'x', y = 'y', crs = hls_proj, rasterize=True, cmap='Reds', width=800, height=600, colorbar=True, tiles = 'ESRI')" ] }, { "cell_type": "markdown", "id": "broke-ideal", "metadata": {}, "source": [ "**Add the field boudary to the clipped image**" ] }, { "cell_type": "code", "execution_count": null, "id": "generous-shoot", "metadata": {}, "outputs": [], "source": [ "rds_clipped[0].hvplot.image(x = 'x', y = 'y', crs = hls_proj, rasterize=True, cmap='Reds', width=800, height=600, colorbar=True, tiles = 'ESRI') * farmField" ] }, { "cell_type": "markdown", "id": "infrared-expression", "metadata": {}, "source": [ "**Get the Geotransformation information for the full tile**" ] }, { "cell_type": "code", "execution_count": null, "id": "driving-cholesterol", "metadata": {}, "outputs": [], "source": [ "rds.spatial_ref.GeoTransform" ] }, { "cell_type": "markdown", "id": "residential-competition", "metadata": {}, "source": [ "**Geotransformation information for the clipped image**" ] }, { "cell_type": "code", "execution_count": null, "id": "hybrid-bangkok", "metadata": {}, "outputs": [], "source": [ "rds_clipped.spatial_ref.GeoTransform" ] }, { "cell_type": "markdown", "id": "differential-accountability", "metadata": {}, "source": [ "### Request data for a point of interest" ] }, { "cell_type": "code", "execution_count": null, "id": "forward-corruption", "metadata": {}, "outputs": [], "source": [ "from pyproj import Transformer\n", "\n", "# convert coordinate to raster projection\n", "lon = -101.66786\n", "lat = 41.05679" ] }, { "cell_type": "code", "execution_count": null, "id": "simple-choice", "metadata": {}, "outputs": [], "source": [ "transformer = Transformer.from_crs(\"EPSG:4326\", rioxarray.open_rasterio(foa_url, masked=True).rio.crs, always_xy=True)\n", "xx, yy = transformer.transform(lon, lat)\n", "print(f'X,Y in source units: {xx},{yy}')" ] }, { "cell_type": "code", "execution_count": null, "id": "political-glossary", "metadata": {}, "outputs": [], "source": [ "# get value from grid\n", "value = rds.sel(x=xx, y=yy, method=\"nearest\").values\n", "value" ] }, { "cell_type": "markdown", "id": "supposed-rocket", "metadata": {}, "source": [ "**Plot the point with the field boundary and the area extraction**" ] }, { "cell_type": "code", "execution_count": null, "id": "chicken-closing", "metadata": {}, "outputs": [], "source": [ "point_df = pandas.DataFrame({'x':[xx],\n", " 'y':[yy],\n", " 'value':[value[0]]})" ] }, { "cell_type": "code", "execution_count": null, "id": "downtown-equation", "metadata": {}, "outputs": [], "source": [ "image = rds_clipped[0].hvplot.image(x = 'x', y = 'y', crs = hls_proj, rasterize=True, cmap='Reds', width=800, height=600, colorbar=True, tiles = 'ESRI') \n", "point = point_df.hvplot.points('x', 'y', hover_cols='value', crs=hls_proj, color='yellow', size=100) # https://stackoverflow.com/questions/59678780/show-extra-columns-when-hovering-in-a-scatter-plot-with-hvplot\n", "\n", "image * farmField * point" ] }, { "cell_type": "markdown", "id": "resistant-sussex", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "id": "numerical-natural", "metadata": {}, "source": [ "## Resources\n", "\n", "- https://nbviewer.jupyter.org/gist/rsignell-usgs/f4dd62ad1274c5b5ed69e5a6b81c1295 \n", "- http://rasterio.readthedocs.io/en/latest/topics/resampling.html \n", "- https://gis.stackexchange.com/questions/358036/extracting-data-from-a-raster/358058#358058" ] }, { "cell_type": "markdown", "id": "69657eea-428c-440e-8267-4feb93bc5516", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "id": "7fa3ff9d-8557-49db-988b-960713780606", "metadata": {}, "source": [ "## [Next: Topic 3 - Data Discovery - CMR-STAC API](Topic_3__Data_Discovery_STAC_CMR-STAC_API.ipynb)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 5 }