{ "cells": [ { "cell_type": "markdown", "id": "f8c32ab5", "metadata": {}, "source": [ "Let's dig into waterfall plots in plotly, and see how much we can customize this" ] }, { "cell_type": "code", "execution_count": 1, "id": "be8f4a63", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import plotly.graph_objects as go\n", "\n", "import plotly.io as pio\n", "pio.renderers.default = \"notebook_connected\"" ] }, { "cell_type": "markdown", "id": "9763c551", "metadata": {}, "source": [ "Start with the base example from the documentation:" ] }, { "cell_type": "code", "execution_count": 2, "id": "65a1a053", "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = go.Figure(go.Waterfall(\n", " measure = [\"relative\", \"relative\", \"total\", \"relative\", \"relative\", \"total\"],\n", " x = [\"Sales\", \"Consulting\", \"Net revenue\", \"Purchases\", \"Other expenses\", \"Profit before tax\"],\n", " y = [60, 80, 0, -40, -20, 0],\n", " textposition = \"outside\",\n", " text = [\"▲60\", \"▲80\", \"\", \"▼40\", \"▼20\", \"Total\"],\n", " connector = {\"line\":{\"color\":\"grey\"}},\n", "))\n", "\n", "fig.update_layout(title = \"Profit and loss statement 2018\")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "id": "a5417e54", "metadata": {}, "source": [ "Let's break this down a little, starting with an even more simple example into the three most important building blocks:\n", "\n", "* `measure`: Options: `'relative'`, `'total'`, or `'absolute'`\n", " * `'total'` compute the sums\n", " * `'absolute'` resets the computed total or use as an initial value\n", "* `x`: Since this is a horizontal bar waterfall chart, x is a list of 'strings' that represent the categories\n", "* `y`: Where the `measure` is relative, this needs to be a change. \n", "* `text`: The label for each bar in the waterfall chart." ] }, { "cell_type": "markdown", "id": "f1b532dd", "metadata": {}, "source": [ "Notice the bars are automatically colored based on `y` values and the `measure` values. The `measure` default is `relative`, and having an `absolute` item would be just for instances in which you want to reset the aggregation or if you want to start with a number. \n", "\n", "### Here's an example of using `absolute` to walk from \"Net revenue\" to \"Profit before tax\":" ] }, { "cell_type": "code", "execution_count": 3, "id": "f5bb5e52", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = go.Figure(go.Waterfall(\n", " measure = [\"absolute\", \"relative\", \"relative\", \"total\"],\n", " x = [\"Net revenue\", \"Purchases\", \"Other expenses\", \"Profit before tax\"],\n", " y = [140, -40, -20, 0],\n", "))\n", "\n", "fig.update_layout(title='measure = [\"absolute\", \"relative\", \"relative\", \"total\"]
y=[140, -40, -20, 0]')\n", "fig.show()" ] }, { "cell_type": "markdown", "id": "a86b239b", "metadata": {}, "source": [ "Notice that we now need to set a value for `y` for `Net revenue` because we set the `measure` to \"absolute\". If we left it at 0 or set it to \"total\" it would just be 0." ] }, { "cell_type": "markdown", "id": "72342470", "metadata": {}, "source": [ "# A Finance Example\n", "\n", "Let's do a simple example of a year over year operating margin walk. Let's imagine we had data that was just *perfectly* shaped, like:" ] }, { "cell_type": "code", "execution_count": 4, "id": "db2c426a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
measurexytext
0absolute2020 OM80$80M
1relativeRevenue10▲$10M
2relativeCost of Sales-5▼$5M
3relativeSpending-20▼$20M
4total2021 OM100$65M
\n", "
" ], "text/plain": [ " measure x y text\n", "0 absolute 2020 OM 80 $80M\n", "1 relative Revenue 10 ▲$10M\n", "2 relative Cost of Sales -5 ▼$5M\n", "3 relative Spending -20 ▼$20M\n", "4 total 2021 OM 100 $65M" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(data={\"measure\": [\"absolute\", \"relative\", \"relative\", \"relative\", \"total\"],\n", " \"x\":[\"2020 OM\", \"Revenue\", \"Cost of Sales\", \"Spending\", \"2021 OM\"],\n", " \"y\" : [80, 10, -5, -20, 100],\n", " \"text\" : [\"$80M\", \"▲$10M\", \"▼$5M\", \"▼$20M\", \"$65M\"]})\n", "df" ] }, { "cell_type": "markdown", "id": "ea774286", "metadata": {}, "source": [ "Then we could construct our waterfall as follows:" ] }, { "cell_type": "code", "execution_count": 5, "id": "b3c66931", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = go.Figure(go.Waterfall(\n", " measure = df[\"measure\"],\n", " x = df[\"x\"],\n", " y = df[\"y\"],\n", " textposition = \"outside\",\n", " text = df[\"text\"],\n", " connector = {\"line\":{\"color\":\"grey\"}},\n", "))\n", "\n", "fig.update_layout(title = \"2020-2021 Operating Margin\")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "id": "428f0911", "metadata": {}, "source": [ "The above waterfall shows how changes in revenue, costs, and spending drove a decrease in operating margin from 2020 to 2021. Reading from left to right:\n", "\n", "* 2020 Operating Margin (OM) was \\\\$80M\n", "* Revenue increased by \\\\$10M\n", "* Cost of Sales increased by \\\\$5M compared to last year\n", "* Spending increased by \\\\$20M from the previous year\n", "* The result of these changes is a \\\\$65M Operating Margin for 2021." ] }, { "cell_type": "markdown", "id": "172425a4", "metadata": {}, "source": [ "## Getting the data in just the right format is left as an exercise for the reader\n", "\n", "We can already see two things that are going to be annoying: \n", "\n", "1. Formatting a dataframe to get all of these values in *just* the right shape for this\n", "2. Automatically generating the `text` labels based on the values rather than explicitly typing them in. \n", "\n", "Plotly helps with the rendering of the SVG, everything else is up to you! Still the best option for making a waterfall chart in Python that I've seen." ] }, { "cell_type": "markdown", "id": "5547fe98", "metadata": {}, "source": [ "# Customization\n", "\n", "Plotly comes with a bunch of templates built in that are a great starting point. We'll go with the simple version:" ] }, { "cell_type": "code", "execution_count": 6, "id": "36269900", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = go.Figure(go.Waterfall(\n", " measure = df[\"measure\"],\n", " x = df[\"x\"],\n", " y = df[\"y\"],\n", " textposition = \"outside\",\n", " text = df[\"text\"],\n", " connector = {\"line\":{\"color\":\"grey\"}},\n", "))\n", "\n", "fig.update_layout(title = \"2020-2021 Operating Margin
(template = simple_white)\",\n", " template='simple_white')\n", "\n", "fig.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" } }, "nbformat": 4, "nbformat_minor": 5 }