{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# PyLops-distributed - Zarr file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook we will learn how to save a NumPy array to a Zarr file for distributed computation.\n", "\n", "We will use the reflection response for Marchenko redatuming as sample data." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2\n", "%matplotlib inline\n", "\n", "import os\n", "import shutil\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import scipy as sp\n", "import dask.array as da\n", "import pylops\n", "import pylops_distributed\n", "import zarr\n", "\n", "from pylops.utils import dottest" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set the `STORE_PATH` environment variable with the path to the store" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#os.environ[\"STORE_PATH\"] = ..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Local cluster" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "#client = pylops_distributed.utils.backend.dask(processes=False, threads_per_worker=2,\n", "# n_workers=2)\n", "#client" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "SSH cluster" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from dask.distributed import Scheduler, Client\n", "\n", "client = Client('be-linrgsn045:8786')" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n",
"Client\n", "
| \n",
"\n",
"Cluster\n", "
| \n",
"
| Type | zarr.core.Array |
|---|---|
| Data type | complex64 |
| Shape | (500, 101, 101) |
| Chunk shape | (125, 101, 101) |
| Order | C |
| Read-only | False |
| Compressor | Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0) |
| Store type | zarr.storage.DirectoryStore |
| No. bytes | 40804000 (38.9M) |
| No. bytes stored | 396 |
| Storage ratio | 103040.4 |
| Chunks initialized | 0/4 |
\n",
"
| \n",
"\n", "\n", " | \n", "
| Type | zarr.core.Array |
|---|---|
| Data type | complex64 |
| Shape | (500, 2451, 2451) |
| Chunk shape | (25, 2451, 2451) |
| Order | C |
| Read-only | False |
| Compressor | Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0) |
| Store type | zarr.storage.DirectoryStore |
| No. bytes | 24029604000 (22.4G) |
| No. bytes stored | 399 |
| Storage ratio | 60224571.4 |
| Chunks initialized | 0/20 |
\n",
"
| \n",
"\n", "\n", " | \n", "