{ "cells": [ { "cell_type": "markdown", "id": "96d5e526-425b-4319-8043-828208956ef9", "metadata": {}, "source": [ "# Cluster plots\n", "Using `stackview.clusterplot` we can visualize the contents of a pandas DataFrame and the corresponding segmented objects in an image side-by-side. In such a plot you can select objects and visualize the selection. This might be useful for exploring feature extraction parameter spaces." ] }, { "cell_type": "code", "execution_count": 1, "id": "f4ec4d56-e298-40d8-a5b5-836cbcc2897d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'0.12.0'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "import numpy as np\n", "import stackview\n", "from skimage.measure import regionprops_table\n", "from skimage.io import imread\n", "from skimage.filters import threshold_otsu\n", "from skimage.measure import label\n", "import matplotlib.pyplot as plt\n", "from sklearn.preprocessing import StandardScaler\n", "from umap import UMAP\n", "\n", "stackview.__version__" ] }, { "cell_type": "markdown", "id": "dfddfa23-9290-425e-b8e5-2fd4d586db9a", "metadata": {}, "source": [ "To demonstrate this, we need an image, a segmentation and a table of extracted features." ] }, { "cell_type": "code", "execution_count": 2, "id": "e252c694-f7af-4b0d-9bbe-e6f682e8ab5c", "metadata": {}, "outputs": [], "source": [ "image = imread('data/blobs.tif')\n", "\n", "# Segment the image: Otsu threshold, then label connected components\n", "thresh = threshold_otsu(image)\n", "binary_image = image > thresh\n", "labeled_image = label(binary_image)" ] }, { "cell_type": "code", "execution_count": 4, "id": "9e84d0b9-3b2f-42a9-b28a-b61f729d2b52", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\rober\\miniforge3\\envs\\bob-env\\Lib\\site-packages\\umap\\umap_.py:1952: UserWarning: n_jobs value 1 overridden to 1 by setting random_state. Use no seed for parallelism.\n", " warn(\n" ] }, { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>mean_intensity</th><th>std_intensity</th><th>centroid-0</th><th>centroid-1</th><th>area</th><th>feret_diameter_max</th><th>minor_axis_length</th><th>major_axis_length</th><th>UMAP1</th><th>UMAP2</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>190.854503</td><td>30.269911</td><td>13.212471</td><td>19.986143</td><td>433.0</td><td>36.055513</td><td>16.819060</td><td>34.957399</td><td>4.446589</td><td>0.901159</td></tr>\n",
"    <tr><th>1</th><td>179.286486</td><td>21.824090</td><td>4.270270</td><td>62.945946</td><td>185.0</td><td>21.377558</td><td>11.803854</td><td>21.061417</td><td>2.342915</td><td>-0.930705</td></tr>\n",
"    <tr><th>2</th><td>205.617021</td><td>29.358477</td><td>12.568389</td><td>108.329787</td><td>658.0</td><td>32.449961</td><td>28.278264</td><td>30.212552</td><td>4.911047</td><td>0.156550</td></tr>\n",
"    <tr><th>3</th><td>217.327189</td><td>36.019565</td><td>9.806452</td><td>154.520737</td><td>434.0</td><td>26.925824</td><td>23.064079</td><td>24.535398</td><td>4.941196</td><td>-0.982479</td></tr>\n",
"    <tr><th>4</th><td>212.142558</td><td>29.872907</td><td>13.545073</td><td>246.809224</td><td>477.0</td><td>31.384710</td><td>19.833058</td><td>31.162612</td><td>5.321925</td><td>-1.058476</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " mean_intensity std_intensity centroid-0 centroid-1 area \\\n", "0 190.854503 30.269911 13.212471 19.986143 433.0 \n", "1 179.286486 21.824090 4.270270 62.945946 185.0 \n", "2 205.617021 29.358477 12.568389 108.329787 658.0 \n", "3 217.327189 36.019565 9.806452 154.520737 434.0 \n", "4 212.142558 29.872907 13.545073 246.809224 477.0 \n", "\n", " feret_diameter_max minor_axis_length major_axis_length UMAP1 \\\n", "0 36.055513 16.819060 34.957399 4.446589 \n", "1 21.377558 11.803854 21.061417 2.342915 \n", "2 32.449961 28.278264 30.212552 4.911047 \n", "3 26.925824 23.064079 24.535398 4.941196 \n", "4 31.384710 19.833058 31.162612 5.321925 \n", "\n", " UMAP2 \n", "0 0.901159 \n", "1 -0.930705 \n", "2 0.156550 \n", "3 -0.982479 \n", "4 -1.058476 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "properties = regionprops_table(labeled_image, intensity_image=image, properties=[\n", " 'mean_intensity', 'std_intensity',\n", " 'centroid', 'area', 'feret_diameter_max', \n", " 'minor_axis_length', 'major_axis_length'])\n", "\n", "df = pd.DataFrame(properties)\n", "\n", "# Select numeric columns\n", "numeric_cols = df.select_dtypes(include=[np.number]).columns\n", "\n", "# Scale the data\n", "scaler = StandardScaler()\n", "scaled_data = scaler.fit_transform(df[numeric_cols])\n", "\n", "# Create UMAP embedding\n", "umap = UMAP(n_components=2, random_state=42) \n", "umap_coords = umap.fit_transform(scaled_data)\n", "\n", "# Add UMAP coordinates to dataframe \n", "df['UMAP1'] = umap_coords[:, 0]\n", "df['UMAP2'] = umap_coords[:, 1]\n", "\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 5, "id": "1e8ea1bf-3042-4220-9225-e478623c8261", "metadata": {}, "outputs": [], "source": [ "num_objects = df.shape[0]\n", "pre_selection = np.zeros(num_objects)\n", "pre_selection[:int(num_objects/2)] = 1\n", "\n", "df[\"selection\"] = pre_selection" ] }, { "cell_type": "markdown", "id": 
"668151cb-4a89-43a8-bbb0-7fd4fe54414a", "metadata": {}, "source": [ "## Interaction\n", "Using some more involved code we can also draw the image and the scatter plot side-by-side and make them interact. You can select data points in the plot on the right and the visualization on the left will be updated accordingly." ] }, { "cell_type": "code", "execution_count": 9, "id": "7b2bbd63-3255-4ada-94a6-b77207c8efaf", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f05e304104fc4597bd76f6b244268060", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HBox(children=(HBox(children=(VBox(children=(VBox(children=(HBox(children=(VBox(children=(Image…" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stackview.clusterplot(image=image,\n", " labels=labeled_image,\n", " df=df,\n", " column_x=\"centroid-0\",\n", " column_y=\"centroid-1\",\n", " zoom_factor=1.5,\n", " markersize=15)" ] }, { "cell_type": "markdown", "id": "f9b8afd9-9f0a-4f3e-967f-f7680de602a9", "metadata": {}, "source": [ "Every time the user selects different data points, the selection in our dataframe is update" ] }, { "cell_type": "code", "execution_count": 10, "id": "0aa32ebb-7539-48e1-a8c4-be0deb255d05", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 False\n", "1 True\n", "2 False\n", "3 False\n", "4 False\n", " ... 
\n", "59 True\n", "60 True\n", "61 True\n", "62 True\n", "63 True\n", "Name: selection, Length: 64, dtype: bool" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[\"selection\"]" ] }, { "cell_type": "code", "execution_count": null, "id": "e54890a5-e89c-4861-bf5b-8aa1c5f7f697", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.10" } }, "nbformat": 4, "nbformat_minor": 5 }