{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# Lets-Plot Usage Guide\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"- [System requirements](#sys)\n",
"- [Installation](#install)\n",
"- [Understanding architecture](#implementation)\n",
"- [Learning API](#api)\n",
"- [Getting started](#gsg)\n",
"\n",
"\n",
"**Lets-Plot** is an open-source plotting library for statistical data. It is implemented using the \n",
"[Kotlin programming language](https://kotlinlang.org/) that has a multi-platform nature.\n",
"That's why Lets-Plot provides the plotting functionality that \n",
"is packaged as a JavaScript library, a JVM library, and a native Python extension.\n",
"\n",
"The design of the Lets-Plot library is heavily influenced by [ggplot2](https://ggplot2.tidyverse.org) library.\n",
"\n",
"\n",
"## Installation\n",
"\n",
"Library is distributed via [Maven Repository](https://bintray.com/jetbrains/lets-plot-maven/lets-plot-kotlin-jars).\n",
"You can include it in your Kotlin or Java project using Maven or Gradle configuration files (see also [Developer guide](https://github.com/JetBrains/lets-plot-kotlin/blob/master/USAGE_SWING_JFX_JS.md)),\n",
"or include it in your Jupyter notebook script via `%use lets-plot` annotation (see [Kotlin kernel for IPython/Jupyter](https://github.com/Kotlin/kotlin-jupyter)).\n",
"\n",
"\n",
"## Understanding Lets-Plot architecture\n",
"In `lets-plot`, the **plot** is represented at least by one\n",
"**layer**. It can be built based on the default dataset with the aesthetics mappings, set of scales, or additional \n",
"features applied.\n",
"\n",
"The **Layer** is responsible for creating the objects painted on the ‘canvas’ and it contains the following elements:\n",
"- **Data** - the set of data specified either once for all layers or on a per layer basis.\n",
"One plot can combine multiple different datasets (one per layer).\n",
"- **Aesthetic mapping** - describes how variables in the dataset are mapped to the visual properties of the layer, such as color, shape, size, or position.\n",
"- **Geometric object** - a geometric object that represents a particular type of charts.\n",
"- **Statistical transformation** - computes some kind of statistical summary on the raw input data. \n",
"For example, `bin` statistics is used for histograms and `smooth` is used for regression lines. \n",
"Most stats take additional parameters to specify details of the statistical transformation of data.\n",
"- **Position adjustment** - a method used to compute the final coordinates of geometry. \n",
"Used to build variants of the same `geom` object or to avoid overplotting.\n",
"\n",
"![layer diagram](images/layer-small.png)\n",
"\n",
"\n",
"## Learning API\n",
"The typical code fragment that plots a Lets-Plot chart looks as follows:\n",
"\n",
"```\n",
"import org.jetbrains.letsPlot.*\n",
"import org.jetbrains.letsPlot.geom.*\n",
"import org.jetbrains.letsPlot.stat.*\n",
"\n",
"p = letsPlot() \n",
"p + geom(stat=, position=) { }\n",
"```\n",
"\n",
"### Geometric objects `geom`\n",
"\n",
"You can add a new geometric object (or plot layer) by creating it using the `geomXxx()` function and then adding this object to `ggplot`:\n",
"\n",
"```\n",
"p = letsPlot(data=df)\n",
"p + geomPoint()\n",
"```\n",
"\n",
"See the [geom reference](https://lets-plot.org/kotlin/-lets--plot--kotlin/org.jetbrains.letsPlot.geom/index.html) for more information about the supported\n",
"geometric objects, their arguments, and default values.\n",
"\n",
"There is also a few `statXxx()` functions which also create a plot layer.\n",
"Occasionally, it feels more naturally to use `statXxx()` instead of `geomXxx()` function to add a new plot layer.\n",
"For example, you might prefer to use `statCount()` instead of `geomBar()`.\n",
"\n",
"See the [stat layer reference](https://lets-plot.org/kotlin/-lets--plot--kotlin/org.jetbrains.letsPlot.stat/index.html) for more information about the supported\n",
"stat plot-layer objects, their arguments, and default values.\n",
"\n",
"\n",
"### Collections of plots\n",
"With the [GGBunch()](https://lets-plot.org/kotlin/-lets--plot--kotlin/org.jetbrains.letsPlot/-g-g-bunch/index.html) object, you can\n",
"render a collection of plots.\n",
"Use the `addPlot()` method to add a plot to the bunch and set an arbitrary location and size for plots inside the grid:\n",
"\n",
"```\n",
"bunch = GGBunch()\n",
" .addPlot(plot1, 0, 0)\n",
" .addPlot(plot2, 0, 200)\n",
"bunch.show()\n",
"```\n",
"\n",
"See the [ggbunch.ipynb](https://nbviewer.jupyter.org/github/JetBrains/lets-plot-kotlin/blob/master/docs/examples/jupyter-notebooks/ggbunch.ipynb)\n",
" example for more information.\n",
"\n",
"### Stat `stat`\n",
"\n",
"Add `stat` as an argument to `geomXxx()` function to define statistical data transformations:\n",
"\n",
"`geomPoint(stat=Stat.count())`\n",
"\n",
"Supported statistical transformations:\n",
"\n",
"- `identity`: leave the data unchanged\n",
"- `count`: calculate the number of points with same x-axis coordinate\n",
"- `bin`: calculate the number of points falling in each of adjacent equally sized ranges along the x-axis\n",
"- `bin2d`: calculate the number of points falling in each of adjacent equal sized rectangles on the plot plane\n",
"- `smooth`: perform smoothing\n",
"- `contour`, `contourFilled`, : calculate contours of 3D data\n",
"- `boxplot`: calculate components of a box plot.\n",
"- `density`, `density2D`, `density2DFilled`: perform a kernel density estimation for 1D and 2D data\n",
"\n",
"See the [stat reference](https://lets-plot.org/kotlin/-lets--plot--kotlin/org.jetbrains.letsPlot/-stat/index.html) for more information about the supported\n",
"stat objects, their arguments, and default values.\n",
"\n",
"\n",
"### Aesthetic mappings `mapping`\n",
"With mappings, you can define how variables in dataset are mapped to the visual elements of the chart.\n",
"Add the `{x=< >; y=< >; ...}` closure to `geom`, where:\n",
"- `x`: the dataframe column to map to the x axis. \n",
"- `y`: the dataframe column to map to the y axis.\n",
"- `...`: other visual properties of the chart, such as color, shape, size, or position.\n",
"\n",
"`geom_point() {x = \"cty\"; y = \"hwy\"; color=\"cyl\"}`\n",
"\n",
"### Position adjustment `position`\n",
"\n",
"All layers have a position adjustment that computes the final coordinates of geometry.\n",
"Position adjustment is used to build variances of the same plots and resolve overlapping.\n",
"Override the default settings by using the `position` argument in the `geom` functions:\n",
"\n",
"`geomBar(position=positionFill)`\n",
"\n",
"or\n",
"\n",
"`geomBar(position=positionDodge(width=1.01))`\n",
"\n",
"Available adjustments:\n",
"- `dodge`\n",
"- `jitter`\n",
"- `jitterdodge`\n",
"- `nudge`\n",
"- `identity`\n",
"- `fill`\n",
"- `stack`\n",
"\n",
"See [position functions reference](https://lets-plot.org/kotlin/-lets--plot--kotlin/org.jetbrains.letsPlot.pos/index.html)\n",
"for more information about position adjustments.\n",
"\n",
"### Features affecting the entire plot\n",
"\n",
"#### Scales\n",
"\n",
"Enables choosing a reasonable scale for each mapped variable depending on the variable attributes. Override default scales to tweak\n",
"details like the axis labels or legend keys, or to use a completely different translation from data to aesthetic.\n",
"For example, to override the fill color on the histogram:\n",
"\n",
"`p + geomHistogram() + scaleFillContinuous(\"red\", \"green\")`\n",
"\n",
"See the list of the available `scale` methods in the [scale reference](https://lets-plot.org/kotlin/-lets--plot--kotlin/org.jetbrains.letsPlot.scale/index.html)\n",
"\n",
"#### Coordinated system\n",
"\n",
"The coordinate system determines how the x and y aesthetics combine to position elements in the plot.\n",
"For example, to override the default X and Y ratio:\n",
"\n",
"`p + coordFixed(ratio=2)`\n",
"\n",
"See the list of the available methods in [coordinates reference](https://lets-plot.org/kotlin/-lets--plot--kotlin/org.jetbrains.letsPlot.coord/index.html)\n",
"\n",
"#### Legend\n",
"The axes and legends help users interpret plots.\n",
"Use the `guide` methods or the `guide` argument of the `scale` method to customize the legend.\n",
"For example, to define the number of columns in the legend:\n",
"\n",
"`p + scaleColorDiscrete(guide=guideLegend(ncol=2))`\n",
"\n",
"See more information about the `guideColorbar, guideLegend` functions in the [scale reference](https://lets-plot.org/kotlin/-lets--plot--kotlin/org.jetbrains.letsPlot.scale/index.html)\n",
"\n",
"Adjust legend location on plot using the `theme` legendPosition, legendJustification and legendDirection methods, see:\n",
"[theme reference](https://lets-plot.org/kotlin/-lets--plot--kotlin/org.jetbrains.letsPlot.themes/index.html)\n",
"\n",
"#### Sampling\n",
"\n",
"Sampling is a special technique of data transformation built into Lets-Plot and it is applied after stat transformation.\n",
"Sampling helps prevents UI freezes and out-of-memory crashes when attempting to plot an excessively large number of geometries.\n",
"By default, the technique applies automatically when the data volume exceeds a certain threshold.\n",
"The `samplingNone` value disables any sampling for the given layer. The sampling methods can be chained together using the + operator.\n",
"\n",
"Available methods:\n",
"- `samplingRandomStratified`: randomly selects points from each group proportionally to the group size but also ensures\n",
"that each group is represented by at least a specified minimum number of points.\n",
"- `samplingRandom`: selects data points at randomly chosen indices without replacement.\n",
"- `samplingPick`: analyses X-values and selects all points which X-values get in the set of first `n` X-values found in the population.\n",
"- `samplingSystematic`: selects data points at evenly distributed indices.\n",
"- `samplingCertexDP`, `samplingVertexVW`: simplifies plotting of polygons.\n",
"There is a choice of two implementation algorithms: Douglas-Peucker (`DP`) and\n",
"Visvalingam-Whyatt (`VW`).\n",
"\n",
"For more details, see the [sampling reference](https://lets-plot.org/kotlin/-lets--plot--kotlin/org.jetbrains.letsPlot.sampling/index.html)."
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"\n",
"### Getting started\n",
"\n",
"Let's plot a point chart built using the mpg dataset.\n",
"\n",
"Create the `DataFrame` object and retrieve the data."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"is_executing": false,
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"text/html": [
" \n",
" "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%useLatestDescriptors\n",
"%use lets-plot\n",
"@file:DependsOn(\"com.github.doyaaaaaken:kotlin-csv-jvm:0.7.3\")\n",
"\n",
"import com.github.doyaaaaaken.kotlincsv.client.*\n",
"\n",
"val csvData = java.io.File(\"mpg.csv\")\n",
"\n",
"val mpg: List