{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Compare dimensionality reduction methods \n", "\n", "This notebook demonstrates how to perform dimensionality reduction on a column in a `tlc.Table` using two different\n", "dimensionality reduction algorithms, `pacmap` and `umap`.\n", "\n", "![](../images/reduce-mammoth.png)\n", "\n", "\n", "\n", "The Table we will be using in this notebook contains a column of points in 3 dimensions. We reduce these columns to points in 2 dimensions. While a dimensionality reduction from 3 to 2 is not the most typical use case for dimensionality reduction, it is a good way to visualize and compare the effects of different dimensionality reduction algorithms.\n", "\n", "To run this notebook, you must also have run:\n", "* [1-create-mammoth-table.ipynb](../1-create-tables/3d/write-mammoth-table.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Install dependencies" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install \"3lc[umap,pacmap]\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import tlc\n", "\n", "# Load the table from the previous example. It contains a single column containing the 3D points.\n", "table = tlc.Table.from_names(table_name=\"mammoth-10k\", dataset_name=\"Mammoth\", project_name=\"3LC Tutorials - Mammoth\")\n", "\n", "table.columns" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "umap_params_1 = {\n", " \"n_components\": 2, # Project the data to 2 dimensions\n", " \"n_neighbors\": 15, # Local connectivity, fewer neighbors create more local clusters\n", " \"min_dist\": 0.1, # Minimum distance between points in the embedding space, preserves more local structure\n", " \"metric\": \"euclidean\", # Use Euclidean distance to measure similarity\n", " \"retain_source_embedding_column\": True,\n", " \"source_embedding_column\": \"points\",\n", "}\n", "\n", "reduced_umap_1 = tlc.reduce_embeddings(table, method=\"umap\", **umap_params_1)\n", "\n", "umap_params_2 = {\n", " \"n_components\": 2, # Project the data to 2 dimensions\n", " \"n_neighbors\": 50, # Local connectivity, more neighbors create more global structure\n", " \"min_dist\": 0.5, # Minimum distance between points in the embedding space, allows more spread out embedding\n", " \"metric\": \"manhattan\", # Use Manhattan distance to measure similarity\n", " \"retain_source_embedding_column\": True,\n", " \"source_embedding_column\": \"points\",\n", "}\n", "\n", "reduced_umap_2 = tlc.reduce_embeddings(table, method=\"umap\", **umap_params_2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pacmap_param_1 = {\n", " \"n_components\": 2, # Project the data to 2 dimensions\n", " \"n_neighbors\": 10, # Number of neighbors to consider, fewer neighbors emphasize local structure\n", " \"MN_ratio\": 0.5, # Ratio of mid-near pairs, balancing between local and global structure\n", " \"FP_ratio\": 2.0, # Ratio of far pairs, emphasizing the global structure more\n", " \"retain_source_embedding_column\": True,\n", " \"source_embedding_column\": \"points\",\n", "}\n", "\n", "reduced_pacmap_1 = tlc.reduce_embeddings(reduced_umap_2, method=\"pacmap\", **pacmap_param_1)\n", "\n", "pacmap_param_2 = {\n", " \"n_components\": 2, # Project the data to 2 dimensions\n", " \"n_neighbors\": 30, # Number of neighbors to consider, more neighbors emphasize global structure\n", " \"MN_ratio\": 1.0, # Ratio of mid-near pairs, equal balance between local and global structure\n", " \"FP_ratio\": 1.0, # Ratio of far pairs, standard emphasis on global structure\n", " \"retain_source_embedding_column\": True,\n", " \"source_embedding_column\": \"points\",\n", "}\n", "\n", "reduced_pacmap_2 = tlc.reduce_embeddings(reduced_pacmap_1, method=\"pacmap\", **pacmap_param_2)" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.9" } }, "nbformat": 4, "nbformat_minor": 2 }