{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import altair as alt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We're going to start by constructing a data frame with two streams of samples: the \"slow\" samples grow by one for every timestamp step, and the \"fast\" samples grow by nineteen for every timestamp step. Both streams of samples will grow more quickly (but at the same proportions) in last third of our of our dataset, with a large jump in both." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "slow = list(range(1,100)) + [x * 2 for x in range(101,150)]\n", "fast = [x * 19 for x in slow]\n", "kinds = ([\"slow\"] * len(slow)) + ([\"fast\"] * len(fast))\n", "indices = list(range(len(slow))) * 2\n", "data = pd.DataFrame({\"timestamp\": indices,\"sample\": slow + fast, \"kind\": kinds})\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data.sample(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we plot our data on a linear scale, it isn't obvious that both streams of samples are always growing at the same rate (and, thus, that the proportion of fast to slow quantities never changes):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "alt.Chart(data).mark_line().encode(alt.X(\"timestamp\"), \n", " alt.Y(\"sample\"), \n", " color=\"kind\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On a log scale, on the other hand, the slope between two points always corresponds to the rate of change between those points. Put another way, doubling a quantity means moving a point by the same amount of vertical space no matter how large or small it is. Since the slopes of these lines are the same, it is obvious that both are growing proportionally -- the vertical distance between both lines is constant." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "alt.Chart(data).mark_line().encode(alt.X(\"timestamp\"), \n", " alt.Y(\"sample\", scale=alt.Scale(type=\"log\")), \n", " color=\"kind\").interactive()\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's see what happens when the rates of change change over time." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "rates = [(1 + x) / (len(slow) / 6) for x in range(len(slow))]\n", "slow2 = list(np.multiply(slow, rates))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "temp = pd.DataFrame({\"timestamp\": list(range(len(slow))), \n", " \"kind\": [\"less-slow\"] * len(slow),\n", " \"sample\": slow2})\n", "\n", "data2 = pd.concat([temp, data[data[\"kind\"] == \"fast\"]])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we plot the `fast` and `less-slow` series with a linear Y axis, it isn't particularly obvious that the proportion of each kind of sample changes over time even though the less-slow samples are growing five times as fast at the right side of the graph as they were on the left." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "alt.Chart(data2).mark_line().encode(alt.X(\"timestamp\"), \n", " alt.Y(\"sample\"), \n", " color=\"kind\")\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we use a log scale, on the other hand, we can see that the slopes diverge over time because the lines are closer to one another on the right side of the graph than they were on the left side of the graph." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "alt.Chart(data2).mark_line().encode(alt.X(\"timestamp\"), \n", " alt.Y(\"sample\", scale=alt.Scale(type=\"log\")), \n", " color=\"kind\").interactive()\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.5" } }, "nbformat": 4, "nbformat_minor": 4 }