{ "cells": [ { "cell_type": "markdown", "id": "bbf78031", "metadata": {}, "source": [ "# Lollipop Plot\n", "\n", "A lollipop plot displays each element of a dataset as a segment and a circle. It is usually combined with the `count` stat, and is especially useful when you have several bars of the same height." ] }, { "cell_type": "markdown", "id": "69de96f3", "metadata": {}, "source": [ "1. [Parameters `size`, `stroke` and `linewidth`](#stroke)\n", "\n", "2. [Parameter `fatten`](#fatten)\n", "\n", "3. [Horizontal Sticks](#direction)\n", "\n", "4. [Sloped Baseline](#slope)\n", "\n", "5. [Parameter `stat`](#stat)\n", "\n", "6. [Lollipops in Marginal Layer](#ggmarginal)\n", "\n", "7. [Lollipops and a Regression Line](#slope_and_intercept)\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "4e5f30b9", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:03.252144Z", "iopub.status.busy": "2024-04-26T11:48:03.252144Z", "iopub.status.idle": "2024-04-26T11:48:05.645840Z", "shell.execute_reply": "2024-04-26T11:48:05.645840Z" } }, "outputs": [], "source": [ "import random\n", "import pandas as pd\n", "from sklearn.linear_model import LinearRegression\n", "\n", "from lets_plot import *" ] }, { "cell_type": "code", "execution_count": 2, "id": "aca5e584", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:05.645840Z", "iopub.status.busy": "2024-04-26T11:48:05.645840Z", "iopub.status.idle": "2024-04-26T11:48:05.661581Z", "shell.execute_reply": "2024-04-26T11:48:05.661581Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " \n", " " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "LetsPlot.setup_html()" ] }, { "cell_type": "code", "execution_count": 3, "id": "1a13082d", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:05.661581Z", "iopub.status.busy": "2024-04-26T11:48:05.661581Z", "iopub.status.idle": "2024-04-26T11:48:05.677206Z", "shell.execute_reply": "2024-04-26T11:48:05.677206Z" } }, "outputs": [], "source": [ "data = {\n", " 'x': [v - 15 for v in range(30)],\n", " 'y': [random.uniform(1, 5) for _ in range(30)],\n", " 'sugar': [v + 150 for v in range(30)]\n", "}" ] }, { "cell_type": "code", "execution_count": 4, "id": "fd78698e", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:05.677206Z", "iopub.status.busy": "2024-04-26T11:48:05.677206Z", "iopub.status.idle": "2024-04-26T11:48:05.804022Z", "shell.execute_reply": "2024-04-26T11:48:05.804022Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ggplot(data, aes('x', 'y')) + geom_lollipop() + ggsize(600, 200)" ] }, { "cell_type": "markdown", "id": "79609dc2", "metadata": {}, "source": [ "<a id=\"stroke\"></a>\n", "\n", "#### 1. Parameters `size`, `stroke` and `linewidth`" ] }, { "cell_type": "code", "execution_count": 5, "id": "5bdea834", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:05.806423Z", "iopub.status.busy": "2024-04-26T11:48:05.806423Z", "iopub.status.idle": "2024-04-26T11:48:05.820071Z", "shell.execute_reply": "2024-04-26T11:48:05.820071Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gggrid([\n", " ggplot(data, aes('x', 'y', size='sugar')) + geom_lollipop() + ggtitle(\"variable 'size'\"),\n", " ggplot(data, aes('x', 'y', size='sugar', stroke='sugar')) + geom_lollipop() + ggtitle(\"variable 'size' and 'stroke'\"),\n", " ggplot(data, aes('x', 'y', size='sugar', linewidth='sugar')) + geom_lollipop() + ggtitle(\"variable 'size' and 'linewidth'\")\n", "], ncol=1) + ggsize(800, 800)\n", "\n" ] }, { "cell_type": "markdown", "id": "d72102d6", "metadata": {}, "source": [ "<a id=\"fatten\"></a>\n", "\n", "#### 2. Parameter `fatten`" ] }, { "cell_type": "code", "execution_count": 6, "id": "a68c6d9c", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:05.822043Z", "iopub.status.busy": "2024-04-26T11:48:05.822043Z", "iopub.status.idle": "2024-04-26T11:48:05.835972Z", "shell.execute_reply": "2024-04-26T11:48:05.835972Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gggrid([\n", " ggplot(data, aes('x', 'y')) + geom_lollipop() + ggtitle(\"fatten=2.5 (default)\"),\n", " ggplot(data, aes('x', 'y')) + geom_lollipop(fatten=5) + ggtitle(\"fatten=5\"),\n", "])" ] }, { "cell_type": "markdown", "id": "de1e9b3f", "metadata": {}, "source": [ "<a id=\"direction\"></a>\n", "\n", "#### 3. Horizontal Sticks" ] }, { "cell_type": "code", "execution_count": 7, "id": "d5d35123", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:05.835972Z", "iopub.status.busy": "2024-04-26T11:48:05.835972Z", "iopub.status.idle": "2024-04-26T11:48:05.852859Z", "shell.execute_reply": "2024-04-26T11:48:05.851745Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ggplot(data, aes('y', 'x')) + geom_lollipop(dir=\"h\")" ] }, { "cell_type": "markdown", "id": "3e276f0f", "metadata": {}, "source": [ "<a id=\"slope\"></a>\n", "\n", "#### 4. Sloped Baseline" ] }, { "cell_type": "code", "execution_count": 8, "id": "409caaf3", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:05.852859Z", "iopub.status.busy": "2024-04-26T11:48:05.852859Z", "iopub.status.idle": "2024-04-26T11:48:05.868698Z", "shell.execute_reply": "2024-04-26T11:48:05.867893Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "slope = 0.5\n", "intercept = 1\n", "\n", "abline = (\n", "    ggplot(data, aes('x', 'y')) +\n", "    geom_abline(intercept=intercept, slope=slope, color='black', linetype='dotted', size=1.5) +\n", "    coord_fixed(ylim=[-12, 12])\n", ")\n", "\n", "gggrid([\n", "    abline + geom_lollipop(intercept=intercept, slope=slope, shape=21) + ggtitle(\"dir='v' (default)\"),\n", "    abline + geom_lollipop(intercept=intercept, slope=slope, shape=21, dir=\"h\") + ggtitle(\"dir='h'\"),\n", "    abline + geom_lollipop(intercept=intercept, slope=slope, shape=21, dir=\"s\") + ggtitle(\"dir='s'\"),\n", "])" ] }, { "cell_type": "markdown", "id": "3bd40d9f", "metadata": {}, "source": [ "<a id=\"stat\"></a>\n", "\n", "#### 5. Parameter `stat`" ] }, { "cell_type": "code", "execution_count": 9, "id": "0fc4c215", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:05.868698Z", "iopub.status.busy": "2024-04-26T11:48:05.868698Z", "iopub.status.idle": "2024-04-26T11:48:06.011964Z", "shell.execute_reply": "2024-04-26T11:48:06.010565Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Unnamed: 0</th><th>manufacturer</th><th>model</th><th>displ</th><th>year</th><th>cyl</th><th>trans</th><th>drv</th><th>cty</th><th>hwy</th><th>fl</th><th>class</th></tr>\n",
"  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>audi</td><td>a4</td><td>1.8</td><td>1999</td><td>4</td><td>auto(l5)</td><td>f</td><td>18</td><td>29</td><td>p</td><td>compact</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>audi</td><td>a4</td><td>1.8</td><td>1999</td><td>4</td><td>manual(m5)</td><td>f</td><td>21</td><td>29</td><td>p</td><td>compact</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>audi</td><td>a4</td><td>2.0</td><td>2008</td><td>4</td><td>manual(m6)</td><td>f</td><td>20</td><td>31</td><td>p</td><td>compact</td></tr>\n",
"  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " Unnamed: 0 manufacturer model displ year cyl trans drv cty hwy \\\n", "0 1 audi a4 1.8 1999 4 auto(l5) f 18 29 \n", "1 2 audi a4 1.8 1999 4 manual(m5) f 21 29 \n", "2 3 audi a4 2.0 2008 4 manual(m6) f 20 31 \n", "\n", " fl class \n", "0 p compact \n", "1 p compact \n", "2 p compact " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(\"https://raw.githubusercontent.com/JetBrains/lets-plot-docs/master/data/mpg.csv\")\n", "df.head(3)" ] }, { "cell_type": "code", "execution_count": 10, "id": "341237ee", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:06.011964Z", "iopub.status.busy": "2024-04-26T11:48:06.011964Z", "iopub.status.idle": "2024-04-26T11:48:06.057925Z", "shell.execute_reply": "2024-04-26T11:48:06.057925Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gggrid([\n", " ggplot(df, aes(x=\"class\")) + geom_lollipop(stat='count') + ggtitle(\"stat='count'\"),\n", " ggplot(df, aes(x=\"hwy\")) + geom_lollipop(stat='bin') + ggtitle(\"stat='bin'\"),\n", " ggplot(df, aes(x=\"hwy\")) + geom_lollipop(stat='density', n=30) + ggtitle(\"stat='density'\"),\n", "])" ] }, { "cell_type": "markdown", "id": "932bd637", "metadata": {}, "source": [ "<a id=\"ggmarginal\"></a>\n", "\n", "#### 6. Lollipops in Marginal Layer" ] }, { "cell_type": "code", "execution_count": 11, "id": "14b8ff38", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:06.059528Z", "iopub.status.busy": "2024-04-26T11:48:06.059528Z", "iopub.status.idle": "2024-04-26T11:48:06.090355Z", "shell.execute_reply": "2024-04-26T11:48:06.090355Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ggplot(df, aes(\"hwy\", \"cty\")) + \\\n", " geom_bin2d(binwidth=[1, 1]) + \\\n", " ggmarginal(\"r\", size=.2, \\\n", " layer=geom_lollipop(aes(color='..count..'), \\\n", " stat='count', orientation='y', size=1))" ] }, { "cell_type": "markdown", "id": "1405210e", "metadata": {}, "source": [ "<a id=\"slope_and_intercept\"></a>\n", "\n", "#### 7. Lollipops and a Regression Line" ] }, { "cell_type": "code", "execution_count": 12, "id": "f14b6636", "metadata": { "execution": { "iopub.execute_input": "2024-04-26T11:48:06.090355Z", "iopub.status.busy": "2024-04-26T11:48:06.090355Z", "iopub.status.idle": "2024-04-26T11:48:06.152998Z", "shell.execute_reply": "2024-04-26T11:48:06.152998Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = LinearRegression().fit(df[[\"hwy\"]], df[\"cty\"])\n", "slope, intercept = model.coef_[0], model.intercept_\n", "\n", "ggplot(df, aes(\"hwy\", \"cty\")) + \\\n", " geom_smooth(level=.99) + \\\n", " geom_lollipop(slope=slope, intercept=intercept, \\\n", " size=1.2, shape=21, color=\"black\", fill=\"magenta\") + \\\n", " coord_fixed()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 5 }