{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Measuring differentiation" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/vnd.plotly.v1+html": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import scipy\n", "import numpy as np\n", "from sklearn.neighbors import KernelDensity\n", "from sklearn.decomposition import PCA\n", "from sklearn.model_selection import GridSearchCV\n", "from sklearn.cluster import estimate_bandwidth\n", "from sklearn.cluster import MeanShift, estimate_bandwidth\n", "\n", "import pandas as pd\n", "\n", "from scipy import stats\n", "from scipy.stats import beta\n", "from math import sin\n", "from random import randint\n", "\n", "import matplotlib.pyplot as plt\n", "import itertools as it\n", "\n", "import plotly\n", "import plotly.plotly as py\n", "import plotly.graph_objs as go\n", "from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot\n", "\n", "from ipywidgets import interact, interactive, fixed, interact_manual\n", "import ipywidgets as widgets\n", "init_notebook_mode(connected=True)\n", "\n", "import collections\n", "\n", "def recursively_default_dict():\n", " return collections.defaultdict(recursively_default_dict)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How do you measure the overlap of two distributions?\n", "\n", "This is a frequently asked question, over which a remarkable number of studies have been published. 
When faced with a structured data set, the degree, location and timing of contact between the populations under study can often answer our questions directly, or point to a path for further research.\n", "\n", "Existing metrics for this purpose usually rely on the means, standard deviations and sizes of the populations involved, combinations of which are compared against known distributions to assess their significance (Fisher's F and Student's t distributions are the usual choices).\n", "\n", "The reliance on summary statistics reflects the computing power available to the researchers who developed these metrics, which, for most, was very limited. What I propose here is a rather brutish approach to the matter, but one that takes advantage of our modern tools and processors.\n", "\n", "\n", "We will generate three clouds of points (the possible output of dimensionality reduction on a very neat data set). Two of these populations will have their distance proportional to the sine of an 'X' variable (time, physical location, your choice)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "a2030d5e8c6945cf9e05042ec59a2862", "version_major": 2, "version_minor": 0 }, "text/html": [ "
\n" ], "text/plain": [ "interactive(children=(IntSlider(value=15, description='x', max=30), Output()), _dom_classes=('widget-interact',))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "interactive(children=(Dropdown(description='tup', options=('01', '02', '12'), value='01'), Output()), _dom_classes=('widget-interact',))" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "