{ "cells": [ { "cell_type": "markdown", "id": "seasonal-allah", "metadata": {}, "source": [ "# Demo: ESGF search for CMIP6 at the DKRZ site (subset only)\n", "\n", "ESGF Node at DKRZ: https://esgf-data.dkrz.de/search/cmip6-dkrz/" ] }, { "cell_type": "markdown", "id": "chicken-adventure", "metadata": {}, "source": [ "## Use ESGF search at DKRZ (no distributed search)\n", "\n", "Using ``esgf-pyclient``: \n", "https://esgf-pyclient.readthedocs.io/en/latest/notebooks/examples/search.html" ] }, { "cell_type": "code", "execution_count": null, "id": "union-speaking", "metadata": {}, "outputs": [], "source": [ "from pyesgf.search import SearchConnection\n", "conn = SearchConnection('http://esgf-data.dkrz.de/esg-search',\n", " distrib=False)" ] }, { "cell_type": "markdown", "id": "amateur-education", "metadata": {}, "source": [ "**Search only CMIP6 datasets locally available at DKRZ**" ] }, { "cell_type": "code", "execution_count": null, "id": "buried-convert", "metadata": {}, "outputs": [], "source": [ "ctx = conn.new_context(project='CMIP6', data_node='esgf3.dkrz.de', latest=True, replica=False)\n", "ctx.hit_count" ] }, { "cell_type": "markdown", "id": "rotary-yacht", "metadata": {}, "source": [ "Select a single dataset" ] }, { "cell_type": "code", "execution_count": null, "id": "engaged-ground", "metadata": {}, "outputs": [], "source": [ "results = ctx.search(\n", " institution_id='MPI-M',\n", " source_id='MPI-ESM1-2-HR',\n", " experiment_id='historical', \n", " variable='tas', \n", " frequency='day',\n", " variant_label='r1i1p1f1'\n", ")\n", "len(results)" ] }, { "cell_type": "code", "execution_count": null, "id": "vulnerable-warner", "metadata": {}, "outputs": [], "source": [ "ds = results[0]\n", "ds.json" ] }, { "cell_type": "markdown", "id": "residential-doctrine", "metadata": {}, "source": [ "Get the dataset identifier used by Rook" ] }, { "cell_type": "code", "execution_count": null, "id": "further-nancy", "metadata": {}, "outputs": [], "source": [
"dataset_id = ds.json['instance_id']\n", "dataset_id" ] }, { "cell_type": "markdown", "id": "meaning-engineering", "metadata": {}, "source": [ "Time range" ] }, { "cell_type": "code", "execution_count": null, "id": "parallel-latin", "metadata": {}, "outputs": [], "source": [ "f\"{ds.json['datetime_start']}/{ds.json['datetime_stop']})\"" ] }, { "cell_type": "markdown", "id": "extended-calvin", "metadata": {}, "source": [ "Bounding Box: (West, Sout, East, North)" ] }, { "cell_type": "code", "execution_count": null, "id": "useful-integrity", "metadata": {}, "outputs": [], "source": [ "f\"({ds.json['west_degrees']}, {ds.json['south_degrees']},{ds.json['east_degrees']}, {ds.json['west_degrees']}, {ds.json['north_degrees']})\"\n" ] }, { "cell_type": "markdown", "id": "veterinary-newsletter", "metadata": {}, "source": [ "Size in GB" ] }, { "cell_type": "code", "execution_count": null, "id": "finished-vocabulary", "metadata": {}, "outputs": [], "source": [ "f\"{ds.json['size'] / 1024 / 1024 / 1024} GB\"" ] }, { "cell_type": "markdown", "id": "theoretical-reality", "metadata": {}, "source": [ "## Use Rook to run subset" ] }, { "cell_type": "code", "execution_count": null, "id": "buried-facility", "metadata": {}, "outputs": [], "source": [ "import os\n", "os.environ['ROOK_URL'] = 'http://rook.dkrz.de/wps'\n", "os.environ['ROOK_MODE'] = 'async'\n", "\n", "from rooki import operators as ops" ] }, { "cell_type": "markdown", "id": "starting-capture", "metadata": {}, "source": [ "Run subset workflow\n", "\n", "http://bboxfinder.com/" ] }, { "cell_type": "code", "execution_count": null, "id": "floppy-worship", "metadata": {}, "outputs": [], "source": [ "bbox_africa = \"-23.906250,-35.746512,63.632813,37.996163\"\n", "\n", "wf = ops.Subset(\n", " ops.Input(\n", " 'tas', [dataset_id]\n", " ),\n", " time=\"1850-01-01/1850-12-31\",\n", " area=bbox_africa,\n", " \n", ")\n", "resp = wf.orchestrate()\n", "resp.ok" ] }, { "cell_type": "markdown", "id": "invalid-problem", "metadata": {}, 
"source": [ "### The outputs are available as a Metalink document\n", "https://github.com/metalink-dev" ] }, { "cell_type": "markdown", "id": "analyzed-circle", "metadata": {}, "source": [ "Metalink URL" ] }, { "cell_type": "code", "execution_count": null, "id": "meaningful-occurrence", "metadata": {}, "outputs": [], "source": [ "resp.url" ] }, { "cell_type": "markdown", "id": "elder-ranking", "metadata": {}, "source": [ "Number of files" ] }, { "cell_type": "code", "execution_count": null, "id": "armed-adolescent", "metadata": {}, "outputs": [], "source": [ "resp.num_files" ] }, { "cell_type": "markdown", "id": "academic-consultation", "metadata": {}, "source": [ "Total size in MB" ] }, { "cell_type": "code", "execution_count": null, "id": "fifty-leisure", "metadata": {}, "outputs": [], "source": [ "resp.size_in_mb" ] }, { "cell_type": "markdown", "id": "forbidden-mountain", "metadata": {}, "source": [ "Download URLs" ] }, { "cell_type": "code", "execution_count": null, "id": "complimentary-hydrogen", "metadata": {}, "outputs": [], "source": [ "resp.download_urls()" ] }, { "cell_type": "markdown", "id": "similar-louisville", "metadata": {}, "source": [ "Download and open with xarray" ] }, { "cell_type": "code", "execution_count": null, "id": "mathematical-hundred", "metadata": {}, "outputs": [], "source": [ "ds_0 = resp.datasets()[0]\n", "ds_0" ] }, { "cell_type": "markdown", "id": "young-sender", "metadata": {}, "source": [ "### Provenance\n", "\n", "Provenance information is given using the *PROV* standard.\n", "https://pypi.org/project/prov/" ] }, { "cell_type": "markdown", "id": "least-pillow", "metadata": {}, "source": [ "Provenance: URL to json document" ] }, { "cell_type": "code", "execution_count": null, "id": "shaped-brighton", "metadata": {}, "outputs": [], "source": [ "resp.provenance()" ] }, { "cell_type": "markdown", "id": "obvious-ridge", "metadata": {}, "source": [ "Provenance Plot" ] }, { "cell_type": "code", "execution_count": null, "id": 
"compliant-prospect", "metadata": {}, "outputs": [], "source": [ "from IPython.display import Image\n", "Image(resp.provenance_image())" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.2" } }, "nbformat": 4, "nbformat_minor": 5 }