{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Example of loading a multi-resolution Zarr image from a public S3 repository\n",
"\n",
"The images are taken from the paper \"SARS-CoV-2 productively Infects Human Gut Enterocytes\" published May 2020 in Science: https://doi.org/10.1126/science.abc1669\n",
"\n",
"The electron micrograph images can be viewed online in the [Image Data Resource](https://idr.openmicroscopy.org/webclient/?show=dataset-10201). Both images are over 13 gigapixels each!\n",
"- [hSIOs-1 (79360 x 167424 px, image ID 9822151)](https://idr.openmicroscopy.org/webclient/img_detail/9822151/?dataset=10201)\n",
"- [hSIOs-2 (144384 x 93184 px, image ID 9822152)](https://idr.openmicroscopy.org/webclient/img_detail/9822152/?dataset=10201)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install dependencies if required\n",
"The cell below will install dependencies if you choose to run the notebook in [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb#recent=true). "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install zarr fsspec>=0.3.3 aiohttp"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import dask.array as da\n",
"from IPython.display import display, Image\n",
"from matplotlib import pyplot as plt\n",
"import requests\n",
"import zarr"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Zarr data is stored separately from the IDR, on an S3 object store"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# imageid = 9822151\n",
"imageid = 9822152\n",
"\n",
"endpoint = 'https://uk1s3.embassy.ebi.ac.uk'\n",
"imagepath = f'idr/zarr/v0.1/{imageid}.zarr'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The original image is over 25GB but with the help of [Dask](https://dask.org/) it is easy to lazily load just the required regions of the image.\n",
"\n",
"Images are stored as 5D arrays: multi-channel (`C`) 3D (`X Y Z`) timelapse (`T`) images. The order of the array dimensions are `(T, C, Z, Y, X)`. Since these images are so big downsampled versions (\"multi-resolutions\") have also been calculated.\n",
"\n",
"The list of resolutions is stored in a JSON file called `.zattrs`. Dask automatically creates a nice summary of each resolution in Jupyter. These resolutions will always be stored in order from the largest image (most detailed resolution) to the smallest. Note that although the name of each resolution may be informative this is not always the case so you should not rely on it.\n",
"\n",
"For the official specification of Zarr images see https://github.com/ome/ngff"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Resolution: \"0\"\n"
]
},
{
"data": {
"text/html": [
"