{ "cells": [ { "cell_type": "markdown", "id": "2f71c854-bf7a-4844-931c-00a1e473fe94", "metadata": {}, "source": [ "# Dimension reduction for data visualization" ] }, { "cell_type": "markdown", "id": "f4c7581f-fb70-435d-976d-8c5a8b586d26", "metadata": {}, "source": [ "Non-linear dimension reduction and clustering are powerful tools for quickly seeing where the most significant differences between spectra lie and what is driving these differences. \n", "Here we propose to use UMAP as the dimension reduction algorithm and to plot the 2D representation for different scenarios.\n", "We will start exactly as we did in the other notebook (data_analysis_with_pypam), so that we can make sure we have the HMD downloaded." ] }, { "cell_type": "markdown", "id": "a4420218-2114-44d7-9017-0023520ba482", "metadata": { "tags": [] }, "source": [ "## 0. Setup (install)" ] }, { "cell_type": "markdown", "id": "fe031f58-fb63-43d0-a132-1a9052e8be90", "metadata": {}, "source": [ "First we need to install all the packages required to execute this notebook. You don't need to do this if you're using mybinder." ] }, { "cell_type": "code", "execution_count": null, "id": "1ec070b8-cbd0-4326-a674-843d5d4b8a41", "metadata": { "scrolled": true, "tags": [] }, "outputs": [], "source": [ "!pip install pvlib \n", "!pip install lifewatch-pypam==0.3.2 \n", "!pip install minio" ] }, { "cell_type": "markdown", "id": "21d56026-3d95-49bf-b7f1-4595a3134a33", "metadata": { "tags": [] }, "source": [ "## 1. Download the data" ] }, { "cell_type": "markdown", "id": "43277afc-991c-4bc6-a1ca-a07901684e6d", "metadata": {}, "source": [ "We will download some processed HMB data stored in the cloud to give some examples of how they can be used. \n", "These data will be downloaded to this JupyterLab space, and you will be able to find them in the folder you specify below, organized by station. \n", "Please change this line depending on where you want to store the data. 
" ] }, { "cell_type": "markdown", "id": "755867a2-b6f9-4982-a25a-38fd27132864", "metadata": {}, "source": [ "We start by importing the packages we need for this part of the notebook." ] }, { "cell_type": "code", "execution_count": null, "id": "8d18f1f5-45b8-4151-ae3c-e8119abf82f9", "metadata": {}, "outputs": [], "source": [ "# Import the necessary packages\n", "import minio\n", "import os\n", "import pathlib\n", "from datetime import datetime\n", "import re" ] }, { "cell_type": "markdown", "id": "39fb588b-4ac9-4827-8296-a0695362a3c0", "metadata": {}, "source": [ "