{ "cells": [ { "cell_type": "markdown", "id": "9974e82c-795b-4f44-9ed0-1d539b1a71de", "metadata": {}, "source": [ "# Exploring slices of stacks by plotting corresponding data\n", "In this notebook we will explore a stack of images using an interactive scatterplot of measurements that were done on the individual slices of the stack. For demonstration purposes, we explore a stack of teaching slides an embedding generated using large language models." ] }, { "cell_type": "code", "execution_count": 1, "id": "24a3041b-56b1-4f44-ad18-9b06c5eb62b6", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import stackview\n", "import os\n", "import numpy as np\n", "from skimage.io import imread\n", "import yaml\n", "import requests\n", "import zipfile\n" ] }, { "cell_type": "markdown", "id": "e0e5c014-8e9d-47cf-a23e-668d80c9a7d7", "metadata": {}, "source": [ "First, we download [the dataset](https://zenodo.org/records/14030307), which is licensed [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/) by Robert Haase." ] }, { "cell_type": "code", "execution_count": 2, "id": "3ae7e89b-79cb-43a0-948d-49a7b5bfb32a", "metadata": {}, "outputs": [], "source": [ "# URL of the zip file\n", "url = \"https://zenodo.org/records/14030307/files/png_umap.zip?download=1\"\n", "# Save the zip file locally\n", "zip_path = \"png_umap.zip\"\n", "\n", "if not os.path.exists(zip_path):\n", " # Download the zip file\n", " response = requests.get(url)\n", " with open(zip_path, \"wb\") as f:\n", " f.write(response.content)\n", " \n", " # Extract the contents\n", " with zipfile.ZipFile(zip_path, 'r') as zip_ref:\n", " zip_ref.extractall()\n", "\n", "# Optionally, remove the zip file after extraction\n", "#os.remove(zip_path)" ] }, { "cell_type": "code", "execution_count": 3, "id": "dd5f83a7-f774-4d3e-871a-429775be7c55", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
UMAP0UMAP1filenamepage_indexpng_filenametexturl
02.7852995.12533812623730_14_Summary.pdf012623730_14_Summary_0.pngRobert Haase\\n@haesleinhuepf\\nBIDS Lecture 14/...https://zenodo.org/api/records/12623730/files/...
11.7591095.19602212623730_14_Summary.pdf112623730_14_Summary_1.pngRobert Haase\\n@haesleinhuepf\\nBIDS Lecture 14/...https://zenodo.org/api/records/12623730/files/...
21.6058596.08449112623730_14_Summary.pdf212623730_14_Summary_2.pngRobert Haase\\n@haesleinhuepf\\nBIDS Lecture 14/...https://zenodo.org/api/records/12623730/files/...
31.5819076.08469512623730_14_Summary.pdf312623730_14_Summary_3.pngRobert Haase\\n@haesleinhuepf\\nBIDS Lecture 14/...https://zenodo.org/api/records/12623730/files/...
42.1631197.16110212623730_14_Summary.pdf412623730_14_Summary_4.pngRobert Haase\\n@haesleinhuepf\\nBIDS Lecture 14/...https://zenodo.org/api/records/12623730/files/...
\n", "
" ], "text/plain": [ " UMAP0 UMAP1 filename page_index \\\n", "0 2.785299 5.125338 12623730_14_Summary.pdf 0 \n", "1 1.759109 5.196022 12623730_14_Summary.pdf 1 \n", "2 1.605859 6.084491 12623730_14_Summary.pdf 2 \n", "3 1.581907 6.084695 12623730_14_Summary.pdf 3 \n", "4 2.163119 7.161102 12623730_14_Summary.pdf 4 \n", "\n", " png_filename \\\n", "0 12623730_14_Summary_0.png \n", "1 12623730_14_Summary_1.png \n", "2 12623730_14_Summary_2.png \n", "3 12623730_14_Summary_3.png \n", "4 12623730_14_Summary_4.png \n", "\n", " text \\\n", "0 Robert Haase\\n@haesleinhuepf\\nBIDS Lecture 14/... \n", "1 Robert Haase\\n@haesleinhuepf\\nBIDS Lecture 14/... \n", "2 Robert Haase\\n@haesleinhuepf\\nBIDS Lecture 14/... \n", "3 Robert Haase\\n@haesleinhuepf\\nBIDS Lecture 14/... \n", "4 Robert Haase\\n@haesleinhuepf\\nBIDS Lecture 14/... \n", "\n", " url \n", "0 https://zenodo.org/api/records/12623730/files/... \n", "1 https://zenodo.org/api/records/12623730/files/... \n", "2 https://zenodo.org/api/records/12623730/files/... \n", "3 https://zenodo.org/api/records/12623730/files/... \n", "4 https://zenodo.org/api/records/12623730/files/... " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Read YAML file\n", "with open('png_umap/png_umap.yml', 'r') as file:\n", " loaded_dict = yaml.safe_load(file)\n", "\n", "\n", "df = pd.DataFrame(loaded_dict)\n", "\n", "# Show first few rows of the loaded DataFrame\n", "df.head()" ] }, { "cell_type": "markdown", "id": "0822cc8e-46c9-47ce-9d4b-181913dd1eb3", "metadata": {}, "source": [ "We also define a helper function that loads all images mentioned in a dataframe into one big numpy array." ] }, { "cell_type": "code", "execution_count": 4, "id": "5cf520c5-4a1b-4708-b39e-de9922cb5680", "metadata": {}, "outputs": [], "source": [ "def get_images(selected_rows):\n", " # Load images for selected pages\n", " images = []\n", " for _, row in selected_rows.iterrows():\n", " img_path = os.path.join('png_umap', row['png_filename'])\n", " img = imread(img_path)\n", " images.append(img)\n", "\n", " if len(images) == 0:\n", " return np.zeros((2,2,2))\n", " else:\n", " return np.asarray(images)" ] }, { "cell_type": "markdown", "id": "4829158d-d705-4070-ba65-a9dd64f3ebbc", "metadata": {}, "source": [ "We can use the `sliceplot` function of stackview to visualize the embedding next to selected slides." ] }, { "cell_type": "code", "execution_count": 5, "id": "feac021f-d0a5-4f92-b8dd-4b23c90be0bb", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "bc68c6c84230415bb4ea64701670d2f1", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HBox(children=(VBox(children=(VBox(children=(HBox(children=(VBox(children=(ImageWidget(height=4…" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stackview.sliceplot(df, get_images(df), column_x=\"UMAP0\", column_y=\"UMAP1\", zoom_factor=1.5, zoom_spline_order=2)" ] }, { "cell_type": "markdown", "id": "b871cc66-6088-42ab-9708-42858100ab52", "metadata": {}, "source": [ "## Exercise\n", "Explore the plot by dragging lines around islands of datapoints with the mouse. What content are these islands about?" ] }, { "cell_type": "code", "execution_count": null, "id": "1cc76f41-1668-45a4-82aa-61d4cb3dd783", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 5 }