{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian mixture model\n", "\n", "A Gaussian mixture model (GMM) is a probabilistic model whose density is a weighted average of multiple Gaussian density functions.\n", "These models are often thought of as a clustering technique, because a fitted model can be used to trace back which individual density each sample was drawn from.\n", "However, `chaospy`, which first and foremost deals with forward problems, treats GMMs as a very flexible class of distributions.\n", "\n", "At the most basic level, a GMM can be constructed in `chaospy` from a sequence of means and covariances:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2021-05-18T10:57:26.262571Z", "iopub.status.busy": "2021-05-18T10:57:26.262113Z", "iopub.status.idle": "2021-05-18T10:57:26.271457Z", "shell.execute_reply": "2021-05-18T10:57:26.271735Z" } }, "outputs": [ { "data": { "text/plain": [ "GaussianMixture()" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import chaospy\n", "\n", "# One mean vector and one covariance matrix per mixture component.\n", "means = ([0, 1], [1, 1], [1, 0])\n", "covariances = ([[1.0, -0.9], [-0.9, 1.0]],\n", " [[1.0, 0.9], [ 0.9, 1.0]],\n", " [[0.1, 0.0], [ 0.0, 0.1]])\n", "distribution = chaospy.GaussianMixture(means, covariances)\n", "distribution" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2021-05-18T10:57:26.274489Z", "iopub.status.busy": "2021-05-18T10:57:26.274179Z", "iopub.status.idle": "2021-05-18T10:57:26.363215Z", "shell.execute_reply": "2021-05-18T10:57:26.362921Z" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import numpy\n", "from matplotlib import pyplot\n", "pyplot.rc(\"figure\", figsize=[15, 6], dpi=75)\n", "\n", "xloc, yloc = numpy.mgrid[-2:3:100j, -1:3:100j]\n", "density = distribution.pdf([xloc, yloc])\n", "pyplot.contourf(xloc, yloc, density)\n", "\n", "pyplot.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Model fitting\n", "\n", "`chaospy` supports Gaussian mixture model representation, but does not provide an automatic method for constructing them from data.\n", "However, this is something for example `scikit-learn` supports.\n", "It is possible to use `scikit-learn` to fit a model, and use the generated parameters in the `chaospy` implementation.\n", "For example, let us consider the [Iris example from scikit-learn's documentation](https://scikit-learn.org/stable/auto_examples/mixture/plot_gmm_covariances.html) (\"full\" implementation in 2-dimensional representation):" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2021-05-18T10:57:26.365938Z", "iopub.status.busy": "2021-05-18T10:57:26.365618Z", "iopub.status.idle": "2021-05-18T10:57:26.465942Z", "shell.execute_reply": "2021-05-18T10:57:26.465593Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[5.006 3.428 ]\n", " [6.5464 2.9495]\n", " [5.9171 2.778 ]]\n", "[[[0.1218 0.0972]\n", " [0.0972 0.1408]]\n", "\n", " [[0.3874 0.0922]\n", " [0.0922 0.1104]]\n", "\n", " [[0.2755 0.0966]\n", " [0.0966 0.0926]]]\n" ] } ], "source": [ "from sklearn import datasets, mixture\n", "\n", "model = mixture.GaussianMixture(3, random_state=1234)\n", "model.fit(datasets.load_iris().data)\n", "\n", "means = model.means_[:, :2]\n", "covariances = model.covariances_[:, :2, :2]\n", "print(means.round(4))\n", "print(covariances.round(4))" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { 
"iopub.execute_input": "2021-05-18T10:57:26.469079Z", "iopub.status.busy": "2021-05-18T10:57:26.468588Z", "iopub.status.idle": "2021-05-18T10:57:26.613160Z", "shell.execute_reply": "2021-05-18T10:57:26.612792Z" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "distribution = chaospy.GaussianMixture(means, covariances)\n", "\n", "xloc, yloc = numpy.mgrid[4:8:100j, 1.5:4.5:100j]\n", "density = distribution.pdf([xloc, yloc])\n", "pyplot.contourf(xloc, yloc, density)\n", "\n", "pyplot.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Like `scikit-learn`, `chaospy` also support higher dimensions, but that would make the visualization harder." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Low discrepancy sequences\n", "\n", "`chaospy` support low-discrepancy sequences through inverse mapping.\n", "This support extends to mixture models, making the following possible:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2021-05-18T10:57:26.615714Z", "iopub.status.busy": "2021-05-18T10:57:26.615225Z", "iopub.status.idle": "2021-05-18T10:57:26.733413Z", "shell.execute_reply": "2021-05-18T10:57:26.733665Z" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "pseudo_samples = distribution.sample(500, rule=\"additive_recursion\")\n", "\n", "pyplot.scatter(*pseudo_samples)\n", "pyplot.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Chaos expansion\n", "\n", "To be able to do point collocation method it requires the user to have access to sampler from the input distribution and orthogonal polynomials with respect to the input distribution.\n", "The former is available above, while the latter is available as follows:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2021-05-18T10:57:26.735997Z", "iopub.status.busy": "2021-05-18T10:57:26.735682Z", "iopub.status.idle": "2021-05-18T10:57:26.757446Z", "shell.execute_reply": "2021-05-18T10:57:26.757112Z" } }, "outputs": [ { "data": { "text/plain": [ "polynomial([1.0, q1-3.0518, 0.2121*q1+q0-6.4705])" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "expansion = chaospy.generate_expansion(1, distribution, rule=\"cholesky\")\n", "\n", "expansion.round(4)" ] } ], "metadata": { "jupytext": { "formats": "ipynb,py:percent" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 4 }