{ "cells": [ { "cell_type": "markdown", "id": "8ea8b39f-23cb-47c9-aa0c-a88f5ede4885", "metadata": {}, "source": [ "## Bulk STAC item queries with GeoParquet\n", "\n", "In addition to its [STAC API](https://planetarycomputer.microsoft.com/docs/quickstarts/reading-stac/), the Planetary Computer also provides access to STAC items as [GeoParquet datasets](https://github.com/opengeospatial/geoparquet). These Parquet datasets suit \"bulk\" workloads, where a search might return a very large number of items or where many separate queries would be needed to get the desired result. In general, these Parquet datasets are produced with a lag relative to what's available through the STAC API. Most use cases, including those that need recently added assets, should use our [STAC API](https://planetarycomputer.microsoft.com/docs/quickstarts/reading-stac/).\n", "\n", "This example shows how to load STAC items from a Parquet dataset into a [geopandas](https://geopandas.readthedocs.io/) GeoDataFrame. A similar workflow would be possible with R's [geoarrow](https://wcjochem.github.io/sfarrow/index.html) package, or any other library that can read [GeoParquet](https://github.com/opengeospatial/geoparquet#current-implementations--examples)." ] }, { "cell_type": "code", "execution_count": 1, "id": "f6a65f00-2d8b-4d2a-b92b-804990a3ebe4", "metadata": {}, "outputs": [], "source": [ "import dask.dataframe as dd\n", "import geopandas\n", "import planetary_computer\n", "import pystac_client\n", "import pandas as pd\n", "\n", "pd.options.display.max_columns = 8" ] }, { "cell_type": "markdown", "id": "9d445f67-fbf2-4c20-b8ae-7bd7c65308e1", "metadata": {}, "source": [ "### Loading STAC Items\n", "\n", "Each STAC collection that provides a GeoParquet dataset exposes it as a collection-level asset under the `geoparquet-items` key." ] }, { "cell_type": "code", "execution_count": 2, "id": "f24ab006-2d86-41a6-b728-2e0bcb2a2511", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | type | stac_version | stac_extensions | id | ... | end_datetime | proj:transform | start_datetime | io:supercell_id |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Feature | 1.0.0 | [https://stac-extensions.github.io/projection/... | 12Q-2017 | ... | 2018-01-01 00:00:00+00:00 | [10.0, 0.0, 178910.0, 0.0, -10.0, 2657470.0] | 2017-01-01 00:00:00+00:00 | 12Q |
| 1 | Feature | 1.0.0 | [https://stac-extensions.github.io/projection/... | 15R-2017 | ... | 2018-01-01 00:00:00+00:00 | [10.0, 0.0, 194773.70566898846, 0.0, -10.0, 35... | 2017-01-01 00:00:00+00:00 | 15R |
| 2 | Feature | 1.0.0 | [https://stac-extensions.github.io/projection/... | 16M-2017 | ... | 2018-01-01 00:00:00+00:00 | [10.0, 0.0, 166023.6435927535, 0.0, -10.0, 999... | 2017-01-01 00:00:00+00:00 | 16M |
| 3 | Feature | 1.0.0 | [https://stac-extensions.github.io/projection/... | 20L-2022 | ... | 2023-01-01 00:00:00+00:00 | [10.0, 0.0, 169256.89710350422, 0.0, -10.0, 91... | 2022-01-01 00:00:00+00:00 | 20L |
| 4 | Feature | 1.0.0 | [https://stac-extensions.github.io/projection/... | 20M-2019 | ... | 2020-01-01 00:00:00+00:00 | [10.0, 0.0, 166023.6435927521, 0.0, -10.0, 999... | 2019-01-01 00:00:00+00:00 | 20M |

5 rows × 18 columns

| | type | stac_version | stac_extensions | id | ... | s2:high_proba_clouds_percentage | s2:reflectance_conversion_factor | s2:medium_proba_clouds_percentage | s2:saturated_defective_pixel_percentage |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Feature | 1.0.0 | [https://stac-extensions.github.io/eo/v1.0.0/s... | S2A_MSIL2A_20150704T101006_R022_T35XQA_2021041... | ... | 92.546540 | 0.967449 | 4.807670 | 0.0 |
| 1 | Feature | 1.0.0 | [https://stac-extensions.github.io/eo/v1.0.0/s... | S2A_MSIL2A_20150704T101006_R022_T32TMM_2021041... | ... | 0.048035 | 0.967449 | 0.051376 | 0.0 |
| 2 | Feature | 1.0.0 | [https://stac-extensions.github.io/eo/v1.0.0/s... | S2A_MSIL2A_20150704T101006_R022_T32TMN_2021041... | ... | 0.011238 | 0.967449 | 0.022928 | 0.0 |
| 3 | Feature | 1.0.0 | [https://stac-extensions.github.io/eo/v1.0.0/s... | S2A_MSIL2A_20150704T101006_R022_T36WWC_2021041... | ... | 65.812266 | 0.967449 | 19.050561 | 0.0 |
| 4 | Feature | 1.0.0 | [https://stac-extensions.github.io/eo/v1.0.0/s... | S2A_MSIL2A_20150704T101006_R022_T36WWD_2021041... | ... | 97.629422 | 0.967449 | 1.861097 | 0.0 |

5 rows × 42 columns

| | type | stac_version | stac_extensions | id | ... | s2:high_proba_clouds_percentage | s2:reflectance_conversion_factor | s2:medium_proba_clouds_percentage | s2:saturated_defective_pixel_percentage |
|---|---|---|---|---|---|---|---|---|---|
| 27 | Feature | 1.0.0 | [https://stac-extensions.github.io/eo/v1.0.0/s... | S2A_MSIL2A_20150704T101006_R022_T32RMS_2021041... | ... | 0.000000 | 0.967449 | 0.000000 | 0.0 |
| 56 | Feature | 1.0.0 | [https://stac-extensions.github.io/eo/v1.0.0/s... | S2A_MSIL2A_20150704T101006_R022_T31PFP_2021041... | ... | 2.169701 | 0.967449 | 1.014810 | 0.0 |
| 68 | Feature | 1.0.0 | [https://stac-extensions.github.io/eo/v1.0.0/s... | S2A_MSIL2A_20150704T101006_R022_T31QGU_2021041... | ... | 0.211487 | 0.967449 | 0.171659 | 0.0 |
| 77 | Feature | 1.0.0 | [https://stac-extensions.github.io/eo/v1.0.0/s... | S2A_MSIL2A_20150704T101006_R022_T31SGT_2021041... | ... | 1.710171 | 0.967449 | 1.712733 | 0.0 |
| 80 | Feature | 1.0.0 | [https://stac-extensions.github.io/eo/v1.0.0/s... | S2A_MSIL2A_20150704T101006_R022_T31QHC_2021041... | ... | 0.000000 | 0.967449 | 0.000000 | 0.0 |

5 rows × 42 columns

| | data.file:size | data.file:values | data.href | data.raster:bands | ... | tilejson.href | tilejson.roles | tilejson.title | tilejson.type |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 53208880 | [{'summary': 'No Data', 'values': [0]}, {'summ... | https://ai4edataeuwest.blob.core.windows.net/i... | [{'nodata': 0, 'spatial_resolution': 10}] | ... | https://planetarycomputer.microsoft.com/api/da... | [tiles] | TileJSON with default rendering | application/json |
| 1 | 114187155 | [{'summary': 'No Data', 'values': [0]}, {'summ... | https://ai4edataeuwest.blob.core.windows.net/i... | [{'nodata': 0, 'spatial_resolution': 10}] | ... | https://planetarycomputer.microsoft.com/api/da... | [tiles] | TileJSON with default rendering | application/json |
| 2 | 53981476 | [{'summary': 'No Data', 'values': [0]}, {'summ... | https://ai4edataeuwest.blob.core.windows.net/i... | [{'nodata': 0, 'spatial_resolution': 10}] | ... | https://planetarycomputer.microsoft.com/api/da... | [tiles] | TileJSON with default rendering | application/json |
| 3 | 165601021 | [{'summary': 'No Data', 'values': [0]}, {'summ... | https://ai4edataeuwest.blob.core.windows.net/i... | [{'nodata': 0, 'spatial_resolution': 10}] | ... | https://planetarycomputer.microsoft.com/api/da... | [tiles] | TileJSON with default rendering | application/json |
| 4 | 97175834 | [{'summary': 'No Data', 'values': [0]}, {'summ... | https://ai4edataeuwest.blob.core.windows.net/i... | [{'nodata': 0, 'spatial_resolution': 10}] | ... | https://planetarycomputer.microsoft.com/api/da... | [tiles] | TileJSON with default rendering | application/json |

5 rows × 15 columns