{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Load Kerchunked dataset with Xarray\n", "\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", " \n", "Within this notebook, we will cover:\n", "\n", "1. How to load a Kerchunk pre-generated reference file into Xarray as if it were a Zarr store.\n", "\n", "## Prerequisites\n", "| Concepts | Importance | Notes |\n", "| --- | --- | --- |\n", "| [Kerchunk Basics](../foundations/kerchunk_basics) | Required | Core |\n", "| [Xarray Tutorial](https://tutorial.xarray.dev/intro.html) | Required | Core |\n", "\n", "- **Time to learn**: 45 minutes\n", "---" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Opening Reference Dataset with Fsspec and Xarray\n", "One way of using our reference dataset is opening it with `Xarray`. To do this, we will create an `fsspec` filesystem and pass it to `Xarray`. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# create an fsspec reference filesystem from the Kerchunk output\n", "import fsspec\n", "import xarray as xr\n", "import kerchunk\n", "\n", "fs = fsspec.filesystem(\n", " \"reference\",\n", " fo=\"references/ARG_combined.json\",\n", " remote_protocol=\"s3\",\n", " skip_instance_cache=True,\n", ")\n", "m = fs.get_mapper(\"\")\n", "ds = xr.open_dataset(m, engine=\"zarr\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Opening Reference Dataset with Xarray and the `Kerchunk` Engine\n", "As of writing, the latest version of Kerchunk supports opening an reference dataset with Xarray without specifically creating an fsspec filesystem. This is the same behavior as the example above, just a few less lines of code. \n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "storage_options = {\n", " \"remote_protocol\": \"s3\",\n", " \"skip_instance_cache\": True,\n", "} # options passed to fsspec\n", "open_dataset_options = {\"chunks\": {}} # opens passed to xarray\n", "\n", "ds = xr.open_dataset(\n", " \"references/ARG_combined.json\",\n", " engine=\"kerchunk\",\n", " storage_options=storage_options,\n", " open_dataset_options=open_dataset_options,\n", ")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 2 }