{ "cells": [ { "cell_type": "markdown", "id": "c8583784-3144-436c-a623-d48c848e3f88", "metadata": {}, "source": [ "# Create a small sample RiOMar dataset\n", "\n", "## Context\n", "\n", "### Purpose\n", "\n", "The goal is to create a smaller RiOMar dataset to test regridding to HEALPix on Pangeo EOSC.\n", "\n", "### Description\n", "\n", "In this notebook, we will:\n", "- Open a RiOMar data file\n", "- Select a few time steps to reduce the amount of data\n", "- Save the reduced data in Zarr format\n", "\n", "## Contributions\n", "\n", "### Notebook\n", "\n", "\n", "- Tina Odaka (author), IFREMER (France), @tinaok\n", "\n", "## Bibliography and other interesting resources\n", "\n", "- [RiOMar](https://coast.ifremer.fr/Laboratoires-Environnement-Ressources/LER-Pertuis-Charentais-La-Tremblade/Projets/RIOMAR-2024-2030)\n", "\n", "\n", "```{warning}\n", "This notebook is designed to run on Datarmor, IFREMER's HPC cluster, where the RiOMar data currently resides. Running the notebook directly on Datarmor is necessary because the dataset is large and processing needs to happen close to the data to be efficient. However, the raw data is openly available online at `https://data-fair2adapt.ifremer.fr/riomar/`.\n", "\n", "To improve portability, the notebook also includes the URLs of the online data. 
**However, running it on the cloud would be very slow: the original data is in netCDF format and is read directly, without chunk-aware access through tools such as Kerchunk.**\n", "\n", "```\n", "\n", "## How to set up the Pangeo environment on Datarmor for the Fair2adapt RiOMar use case\n", "\n", "```bash\n", "ssh datarmor\n", "\n", "micromamba create -n riomar python=3.12 xarray zarr hdf5 ipykernel h5netcdf dask netCDF4 bottleneck scipy cftime numba healpy matplotlib hvplot\n", "# Activate the new environment so that pip and ipykernel target it\n", "micromamba activate riomar\n", "pip install git+https://github.com/IAOCEA/xarray-healpy.git\n", "python -m ipykernel install --user --name=riomar\n", "\n", "```\n", "\n", "Then connect to `https://datarmor-jupyterhub.ifremer.fr/`\n" ] }, { "cell_type": "code", "execution_count": null, "id": "c8a8d5ef-ddd4-437b-984b-17146002e3dd", "metadata": {}, "outputs": [], "source": [ "## Import Libraries" ] }, { "cell_type": "code", "execution_count": 6, "id": "baa7c082-97ce-49cc-82b1-424bb7067691", "metadata": {}, "outputs": [], "source": [ "import xarray as xr\n", "import fsspec\n", "from pathlib import Path" ] }, { "cell_type": "markdown", "id": "9b2ee5fb-8c7d-4c4c-a313-ffa33565f6ab", "metadata": {}, "source": [ "## Open the CROCO grid file\n", "- The grid file is read locally if you are running on Datarmor, or accessed via `https` if you are running elsewhere." ] }, { "cell_type": "code", "execution_count": 7, "id": "95b51dfd-8452-44ec-830c-25b0d4a49e24", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 141MB\n",
"Dimensions: (one: 1, eta_rho: 838, xi_rho: 727, bath: 1, eta_u: 838,\n",
" xi_u: 726, eta_v: 837, xi_v: 727, eta_psi: 837, xi_psi: 726)\n",
"Dimensions without coordinates: one, eta_rho, xi_rho, bath, eta_u, xi_u, eta_v,\n",
" xi_v, eta_psi, xi_psi\n",
"Data variables: (12/34)\n",
" xl (one) float64 8B ...\n",
" el (one) float64 8B ...\n",
" depthmin (one) float64 8B ...\n",
" depthmax (one) float64 8B ...\n",
" spherical (one) |S1 1B ...\n",
" angle (eta_rho, xi_rho) float64 5MB ...\n",
" ... ...\n",
" lat_v (eta_v, xi_v) float64 5MB ...\n",
" lat_psi (eta_psi, xi_psi) float64 5MB ...\n",
" mask_rho (eta_rho, xi_rho) float64 5MB ...\n",
" mask_u (eta_u, xi_u) float64 5MB ...\n",
" mask_v (eta_v, xi_v) float64 5MB ...\n",
" mask_psi (eta_psi, xi_psi) float64 5MB ...\n",
"Attributes:\n",
" title: BOB1000 Model\n",
" date: 09-Mar-2023\n",
" type: CROCO grid file\n",
"<xarray.Dataset> Size: 73GB\n",
"Dimensions: (time_counter: 744, s_rho: 40, y_rho: 838, x_rho: 727)\n",
"Coordinates:\n",
" * s_rho (s_rho) float32 160B -0.9875 -0.9625 ... -0.0375 -0.0125\n",
" nav_lat_rho (y_rho, x_rho) float32 2MB ...\n",
" nav_lon_rho (y_rho, x_rho) float32 2MB ...\n",
" time_instant (time_counter) datetime64[ns] 6kB ...\n",
" * time_counter (time_counter) datetime64[ns] 6kB 2006-01-01T00:57:45 ... 2...\n",
"Dimensions without coordinates: y_rho, x_rho\n",
"Data variables:\n",
" temp (time_counter, s_rho, y_rho, x_rho) float32 73GB ...\n",
"Attributes: (12/45)\n",
" name: GAMAR_GLORYS_1h_inst\n",
" description: Created by xios\n",
" Conventions: CF-1.6\n",
" timeStamp: 2024-Apr-02 09:15:02 GMT\n",
" uuid: 1563e80a-8c72-4739-a6b5-424221c7cf2b\n",
" title: GAMAR_GLORYS\n",
" ... ...\n",
" gamma2_expl: Slipperiness parameter\n",
" x_sponge: 0.0\n",
" v_sponge: 0.0\n",
" sponge_expl: Sponge parameters : extent (m) & viscosity (m2.s-1)\n",
" SRCS: main.F step.F read_inp.F timers_roms.F init_scalars.F ini...\n",
" CPP-options: REGIONAL GAMAR MPI TIDES OBC_WEST OBC_NORTH XIOS USE_CALE...\n",
"<xarray.Dataset> Size: 498MB\n",
"Dimensions: (y_rho: 838, x_rho: 727, s_rho: 40, time_counter: 5)\n",
"Coordinates:\n",
" nav_lat_rho (y_rho, x_rho) float64 5MB ...\n",
" nav_lon_rho (y_rho, x_rho) float64 5MB ...\n",
" * s_rho (s_rho) float32 160B -0.9875 -0.9625 ... -0.0375 -0.0125\n",
" * time_counter (time_counter) datetime64[ns] 40B 2006-01-01T00:57:45 ... 2...\n",
" time_instant (time_counter) datetime64[ns] 40B ...\n",
"Dimensions without coordinates: y_rho, x_rho\n",
"Data variables:\n",
" ocean_mask (y_rho, x_rho) bool 609kB ...\n",
" temp (time_counter, s_rho, y_rho, x_rho) float32 487MB ...\n",
"Attributes: (12/45)\n",
" CPP-options: REGIONAL GAMAR MPI TIDES OBC_WEST OBC_NORTH XIOS USE_CALE...\n",
" Conventions: CF-1.6\n",
" Cs_r: have a look at variable Cs_r in this file\n",
" Cs_w: have a look at variable Cs_w in this file\n",
" SRCS: main.F step.F read_inp.F timers_roms.F init_scalars.F ini...\n",
" Tcline: 15.0\n",
" ... ...\n",
" title: GAMAR_GLORYS\n",
" tnu4_expl: biharmonic mixing coefficient for tracers\n",
" units: meter4 second-1\n",
" uuid: 1563e80a-8c72-4739-a6b5-424221c7cf2b\n",
" v_sponge: 0.0\n",
" x_sponge: 0.0