# UpSet.js Jupyter Widget

[UpSet.js](https://upset.js.org) is a JavaScript re-implementation of [UpSetR](https://www.rdocumentation.org/packages/UpSetR/), which is itself based on [UpSet](http://vcg.github.io/upset/about/).
The core library is written in React, but it also provides bundled editions for plain JavaScript use as well as this Jupyter wrapper.

This tutorial explains the basic widget functionality.

Let's begin by importing the widget and some utilities.

In [1]:
from ipywidgets import interact
from upsetjs_jupyter_widget import UpSetJSWidget
import pandas as pd

This wrapper is implemented in Python 3 with mypy typings and generics. The generic type `T` of `UpSetJSWidget` is the type of the elements to handle. The following example handles `str` elements.

In [7]:
w = UpSetJSWidget[str]()
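
The `UpSetJSWidget[str]()` syntax may look unusual, so here is a minimal sketch of how a generic class behaves in plain Python. `ElementBag` is a hypothetical stand-in invented for illustration and is not part of the widget package.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")


class ElementBag(Generic[T]):
    """Hypothetical container that is generic over its element type."""

    def __init__(self) -> None:
        self.elems: List[T] = []

    def add(self, elem: T) -> "ElementBag[T]":
        self.elems.append(elem)
        return self


# Subscripting the class informs type checkers such as mypy; at runtime
# the call still produces an ordinary ElementBag instance.
bag = ElementBag[str]()
bag.add("a").add("b")
print(bag.elems)
```

With this annotation, a type checker can flag a call such as `bag.add(1)`, while the runtime behavior stays unchanged.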

## Basic User Interface

**Note**: The input data is described in more detail in the next section.

In [8]:
dict_input = {'one': ['a', 'b', 'c', 'e', 'g', 'h', 'k', 'l', 'm'], 'two': ['a', 'b', 'd', 'e', 'j'], 'three': ['a', 'e', 'f', 'g', 'h', 'i', 'j', 'l', 'm']}
w.from_dict(dict_input)

UpSetJSWidget(value=None, combinations=[UpSetSetIntersection(name=one, sets={'one'}, cardinality=9, elems={'l'…

An UpSet plot consists of three areas:

- The bottom left area shows the list of sets as a vertical bar chart. The length of each bar corresponds to the cardinality of the set, i.e., the number of elements in the set.
- The top right area shows the list of set intersections as a horizontal bar chart. Again, the length of each bar corresponds to the cardinality of the intersection.
- The bottom right area shows which intersection consists of which sets. A dark dot indicates that the set is part of this set intersection. The line connecting the dots only serves to group them visually.

Moving the mouse over a bar or a dot automatically highlights the corresponding set or set intersection in orange. In addition, the elements shared with the highlighted set are highlighted in the other bars. This gives a quick overview of how sets and set intersections relate to each other. More details can be found in the [Interaction](#interaction) section.

In the bottom right corner there are two buttons for exporting the chart in either PNG or SVG image format.

## Input Formats

In the current version, the UpSet.js wrapper supports three input data formats: a dictionary, an expression, and a Pandas data frame.

### Dictionary Input

The first format is a dictionary of type `Dict[str, List[T]]`, where `T` again refers to the element type, in this case `str`. The key of each dictionary entry is the set name, while the value is the list of elements in this set.

In [9]:
w.from_dict({'one': ['a', 'b', 'c', 'e', 'g', 'h', 'k', 'l', 'm'], 'two': ['a', 'b', 'd', 'e', 'j'], 'three': ['a', 'e', 'f', 'g', 'h', 'i', 'j', 'l', 'm']})

UpSetJSWidget(value=None, combinations=[UpSetSetIntersection(name=one, sets={'one'}, cardinality=9, elems={'h'…

### Expression Input

The second format is a mapping of type `Dict[str, number]`, i.e., an object with an `.items() -> Iterator[Tuple[str, number]]` method. The key of each entry is the name of a set combination, while the value is the number of elements in that combination. By default, `&` is used to split a combination name into its individual sets.

In [11]:
w.from_expression({'one': 9, 'two': 5, 'three': 9, 'one&two': 3, 'one&three': 6, 'two&three': 3, 'one&two&three': 2})

UpSetJSWidget(value=None, combinations=[UpSetSetIntersection(name=one, sets={'one'}, cardinality=9, elems=set(…
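
To make the relation between the two formats concrete, the following sketch derives the expression mapping from the dictionary input in plain Python. The helper name `to_expression` is hypothetical and not part of the widget API; it only illustrates how the keys and counts relate.

```python
from itertools import combinations
from typing import Dict, List


def to_expression(sets: Dict[str, List[str]], sep: str = "&") -> Dict[str, int]:
    """Count the shared elements of every non-empty set combination."""
    names = list(sets)
    counts: Dict[str, int] = {}
    for degree in range(1, len(names) + 1):
        for combo in combinations(names, degree):
            shared = set.intersection(*(set(sets[name]) for name in combo))
            if shared:
                # Join the set names with the separator to form the key.
                counts[sep.join(combo)] = len(shared)
    return counts


dict_input = {
    "one": ["a", "b", "c", "e", "g", "h", "k", "l", "m"],
    "two": ["a", "b", "d", "e", "j"],
    "three": ["a", "e", "f", "g", "h", "i", "j", "l", "m"],
}
expression = to_expression(dict_input)
print(expression)

# Splitting a key on the separator recovers the individual set names.
print("one&two&three".split("&"))
```

For this data, the result contains the same seven entries as the mapping passed to `from_expression` above.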

### Data Frame Input

The last format is a binary/boolean data frame. The index column contains the list of elements. Each regular column represents a set, with boolean values (e.g., 0 and 1) indicating whether the element in that row is part of the set.

The following data frame defines the same set structure as the dictionary format before.

In [12]:
df = pd.DataFrame(dict(
    one=[1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1], 
    two=[1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0], 
    three=[1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1]
), index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm'])
w.from_dataframe(df)

UpSetJSWidget(value=None, combinations=[UpSetSetIntersection(name=one, sets={'one'}, cardinality=9, elems={'h'…
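
As a cross-check, the membership columns of such a data frame can be derived from the dictionary format. This is a sketch in plain Python; `to_membership` is a hypothetical helper, not a widget function.

```python
from typing import Dict, List, Tuple


def to_membership(sets: Dict[str, List[str]]) -> Tuple[List[str], Dict[str, List[int]]]:
    """Return the sorted element index and one 0/1 column per set."""
    index = sorted({elem for elems in sets.values() for elem in elems})
    columns: Dict[str, List[int]] = {}
    for name, elems in sets.items():
        members = set(elems)
        columns[name] = [1 if elem in members else 0 for elem in index]
    return index, columns


dict_input = {
    "one": ["a", "b", "c", "e", "g", "h", "k", "l", "m"],
    "two": ["a", "b", "d", "e", "j"],
    "three": ["a", "e", "f", "g", "h", "i", "j", "l", "m"],
}
index, columns = to_membership(dict_input)
print(index)
for name, column in columns.items():
    print(name, column)
```

The resulting columns can be passed to `pd.DataFrame(columns, index=index)` to obtain a data frame equivalent to `df` above.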

## Basic Attributes

`.elems` returns the list of extracted elements.

In [13]:
w.elems

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm']

`.sets` returns the list of extracted sets as `UpSetSet` objects.

In [14]:
w.sets

[UpSetSet(name=one, cardinality=9, elems={'h', 'a', 'm', 'b', 'k', 'c', 'g', 'l', 'e'}),
 UpSetSet(name=three, cardinality=9, elems={'i', 'h', 'f', 'a', 'm', 'j', 'g', 'l', 'e'}),
 UpSetSet(name=two, cardinality=5, elems={'b', 'j', 'd', 'a', 'e'})]
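
The two results above can be reproduced in plain Python. This sketch assumes that sets are ordered by descending cardinality with ties kept in input order, which is inferred from the output shown above and not taken from the library documentation.

```python
dict_input = {
    "one": ["a", "b", "c", "e", "g", "h", "k", "l", "m"],
    "two": ["a", "b", "d", "e", "j"],
    "three": ["a", "e", "f", "g", "h", "i", "j", "l", "m"],
}

# All distinct elements across the sets, sorted alphabetically.
elems = sorted(set().union(*dict_input.values()))

# Set names ordered by descending cardinality (assumed ordering rule);
# sorted() is stable, so ties keep their input order.
set_names = sorted(dict_input, key=lambda name: -len(set(dict_input[name])))
cardinalities = {name: len(set(dict_input[name])) for name in set_names}

print(elems)
print(cardinalities)
```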

Similarly, `.combinations` returns the list of visualized set combinations, by default as `UpSetSetIntersection` objects.

**Note**: the attribute is called `.combinations` instead of `.intersections` since one can customize the generation of the set combinations that are visualized. For example, one can also generate set unions.

In [16]:
w.combinations

[UpSetSetIntersection(name=one, sets={'one'}, cardinality=9, elems={'h', 'a', 'm', 'b', 'k', 'c', 'g', 'l', 'e'}),
 UpSetSetIntersection(name=three, sets={'three'}, cardinality=9, elems={'i', 'h', 'f', 'a', 'm', 'j', 'g', 'l', 'e'}),
 UpSetSetIntersection(name=(one ∩ three), sets={'three', 'one'}, cardinality=6, elems={'m', 'h', 'g', 'a', 'l', 'e'}),
 UpSetSetIntersection(name=two, sets={'two'}, cardinality=5, elems={'b', 'j', 'd', 'a', 'e'}),
 UpSetSetIntersection(name=(one ∩ two), sets={'one', 'two'}, cardinality=3, elems={'b', 'a', 'e'}),
 UpSetSetIntersection(name=(two ∩ three), sets={'three', 'two'}, cardinality=3, elems={'a', 'j', 'e'}),
 UpSetSetIntersection(name=(one ∩ two ∩ three), sets={'three', 'one', 'two'}, cardinality=2, elems={'a', 'e'})]

`.generate_intersections`, `.generate_distinct_intersections`, and `.generate_unions` let you customize the generation of the set combinations. They accept the following parameters:

 - `min_degree` ... minimum number of sets in a set combination
 - `max_degree` ... maximum number of sets in a set combination, `None` means no limit
 - `empty` ... include empty set combinations with no elements. By default, they are not included
 - `order_by` ... sort set combinations either by `cardinality` (number of elements) or by `degree` (number of sets)
 - `limit` ... show only the first `limit` set combinations

In [17]:
w.copy().generate_distinct_intersections()

UpSetJSWidget(value=None, combinations=[UpSetSetDistinctIntersection(name=(one ∩ three), sets={'three', 'one'}…

In [18]:
w.copy().generate_intersections(min_degree=2, max_degree=None, empty=True, order_by="cardinality", limit=None)

UpSetJSWidget(value=None, combinations=[UpSetSetIntersection(name=(one ∩ three), sets={'three', 'one'}, cardin…

In [19]:
w.copy().generate_unions(min_degree=0, max_degree=2, empty=True, order_by="degree", limit=None)

UpSetJSWidget(value=None, combinations=[UpSetSetUnion(name=(), sets=set(), cardinality=13, elems={'i', 'h', 'd…
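
To show how the three kinds of combinations differ, here is a plain-Python sketch of the parameters described above. It reflects one reading of their semantics and is not the widget's implementation: the function `generate` and its `kind` argument are hypothetical, and degree 0 (the empty combination seen in the union output above) is not modeled.

```python
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple

Combination = Tuple[Tuple[str, ...], Set[str]]


def generate(
    sets: Dict[str, List[str]],
    kind: str = "intersection",  # "intersection", "distinct", or "union"
    min_degree: int = 1,
    max_degree: Optional[int] = None,
    empty: bool = False,
    order_by: str = "cardinality",
    limit: Optional[int] = None,
) -> List[Combination]:
    names = list(sets)
    as_sets = {name: set(elems) for name, elems in sets.items()}
    top = len(names) if max_degree is None else min(max_degree, len(names))
    result: List[Combination] = []
    for degree in range(max(min_degree, 1), top + 1):
        for combo in combinations(names, degree):
            members = [as_sets[name] for name in combo]
            if kind == "union":
                elems = set().union(*members)
            else:
                elems = set.intersection(*members)
                if kind == "distinct":
                    # Keep only elements that are in no other set.
                    others = [as_sets[n] for n in names if n not in combo]
                    elems = elems.difference(*others)
            if elems or empty:
                result.append((combo, elems))
    if order_by == "cardinality":
        result.sort(key=lambda item: -len(item[1]))  # largest first
    else:
        result.sort(key=lambda item: len(item[0]))  # fewest sets first
    return result[:limit]


dict_input = {
    "one": ["a", "b", "c", "e", "g", "h", "k", "l", "m"],
    "two": ["a", "b", "d", "e", "j"],
    "three": ["a", "e", "f", "g", "h", "i", "j", "l", "m"],
}
for kind in ("intersection", "distinct", "union"):
    combos = generate(dict_input, kind=kind, min_degree=2)
    print(kind, {"&".join(combo): len(elems) for combo, elems in combos})
```

For `one` and `three`, the intersection holds 6 elements, the distinct intersection only 4 (the elements `a` and `e` also belong to `two`), and the union 12.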

## Interaction

UpSet.js supports four interaction modes, which can be set via `.mode`:

- `'hover'` (default): when the user hovers over a set or set intersection, it is highlighted
- `'click'`: when the user clicks on a set or a set intersection, the selection is updated
- `'contextMenu'`: when the user right-clicks on a set or a set intersection, the selection is updated
- `'static'`: disables interactivity

In [20]:
w.mode = 'click'
w

UpSetJSWidget(value=None, combinations=[UpSetSetIntersection(name=one, sets={'one'}, cardinality=9, elems={'h'…

With `.selection`, one can manually set the selection that is currently highlighted. Manually setting the selection is only useful in `click` and `static` modes.

In [21]:
w.selection = w.sets[0]
w

UpSetJSWidget(value={'type': 'set', 'name': 'one'}, combinations=[UpSetSetIntersection(name=one, sets={'one'},…

The current selection is synced with the server. It is designed to work with the `interact` function of the `ipywidgets` package. In the following example, the currently selected set is automatically written below the chart and updated interactively.

In [22]:
w.mode = 'hover'
def selection_changed(s):
    return s # s["name"] if s else None
interact(selection_changed, s=w)

interactive(children=(UpSetJSWidget(value={'type': 'set', 'name': 'one'}, combinations=[UpSetSetIntersection(n…

<function __main__.selection_changed(s)>
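
The callback can also format the selection instead of returning it unchanged. The sketch below assumes that the selection value is a dictionary with `type` and `name` keys, as in the widget representation shown earlier; other value shapes are simply converted to a string. `describe_selection` is a hypothetical helper.

```python
from collections.abc import Mapping
from typing import Any, Optional


def describe_selection(s: Any) -> Optional[str]:
    """Turn a selection value into a short label, or None if nothing is selected."""
    if not s:
        return None
    if isinstance(s, Mapping):
        # Assumed shape: {'type': 'set', 'name': 'one'}
        return f"{s.get('type', 'unknown')}: {s.get('name')}"
    return str(s)


print(describe_selection({"type": "set", "name": "one"}))
print(describe_selection(None))
```

Such a function could be passed to `interact` in place of `selection_changed`.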

## Queries

Besides the selection, UpSet.js supports defining queries. A query consists of a name, a color, and either the list of elements or the set (combination) to highlight.

In [23]:
wq = w.copy()
wq.mode = 'static'
wq.selection = None
wq.append_query('Q1', color='red', elems=['a', 'b', 'c'])
wq.append_query('Q2', color='blue', upset=wq.sets[1])
wq

UpSetJSWidget(value=None, combinations=[UpSetSetIntersection(name=one, sets={'one'}, cardinality=9, elems={'h'…
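
Conceptually, a query highlights the part of each bar that overlaps with the queried elements. The following sketch computes these overlaps in plain Python for the two queries above; `query_overlap` is a hypothetical helper and not part of the widget API.

```python
from typing import Dict, Iterable, List


def query_overlap(query_elems: Iterable[str], sets: Dict[str, List[str]]) -> Dict[str, int]:
    """Number of queried elements contained in each set."""
    wanted = set(query_elems)
    return {name: len(wanted & set(elems)) for name, elems in sets.items()}


dict_input = {
    "one": ["a", "b", "c", "e", "g", "h", "k", "l", "m"],
    "two": ["a", "b", "d", "e", "j"],
    "three": ["a", "e", "f", "g", "h", "i", "j", "l", "m"],
}

# Element query: highlight 'a', 'b', and 'c'.
print(query_overlap(["a", "b", "c"], dict_input))

# Set query: highlight everything in the set 'three' (w.sets[1] above).
print(query_overlap(dict_input["three"], dict_input))
```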

## Attributes

UpSet.js supports rendering boxplots as aggregations for numerical attributes and mosaic plots for categorical attributes of elements. They are given as part of the data frame. The `attributes` argument can be either a list of column names or a data frame with the same index.

In [25]:
df_a = pd.DataFrame(dict(
    one=[1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1], 
    two=[1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0], 
    three=[1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1],
    attr=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
), index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm'])
wa = w.copy()
wa.from_dataframe(df_a, attributes=['attr'])
wa

UpSetJSWidget(value=None, attrs=[<upsetjs_jupyter_widget._model.UpSetAttribute object at 0x000001B561E077F0>],…

In [4]:
df_c = pd.DataFrame(dict(
    one=[1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1], 
    two=[1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0], 
    three=[1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1],
    attr=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
    attr2=['a', 'a', 'b', 'b', 'b', 'a', 'b', 'a', 'a', 'b', 'b', 'b', 'a']
), index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm'])
wa = w.copy()
wa.from_dataframe(df_c, attributes=['attr', 'attr2'])
wa

UpSetJSWidget(value=None, attrs=[<upsetjs_jupyter_widget._model.UpSetAttribute object at 0x00000187E06E3D68>, …
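
To illustrate what these plots aggregate, the sketch below summarizes both attributes for the elements of set `two` in plain Python. It is only an approximation of the idea: which statistics UpSet.js actually draws is not specified here, so only the minimum, median, and maximum are computed.

```python
from collections import Counter
from statistics import median

index = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m"]
attr = dict(zip(index, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]))
attr2 = dict(zip(index, ["a", "a", "b", "b", "b", "a", "b", "a", "a", "b", "b", "b", "a"]))

# Elements of set 'two', i.e., rows where the 'two' column is 1.
two = ["a", "b", "d", "e", "j"]

# Numerical attribute: values that a boxplot would summarize.
values = sorted(attr[elem] for elem in two)
numeric_summary = {"min": values[0], "median": median(values), "max": values[-1]}

# Categorical attribute: category counts that a mosaic plot would show.
category_counts = Counter(attr2[elem] for elem in two)

print(numeric_summary)
print(dict(category_counts))
```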

## Styling

### Theme

UpSet.js supports three themes: `light`, `dark`, and `vega`. The theme can be set via the `.theme` property. Besides themes, one can customize various aspects of the style; see https://upset.js.org/api/jupyter/ for the API documentation.

In [9]:
w_dark = w.copy()
w_dark.theme = 'dark'
w_dark

UpSetJSWidget(value=None, combinations=[UpSetSetIntersection(name=one, sets={'one'}, cardinality=9, elems={'l'…

### Labels

In [21]:
w_label = w.copy()
w_label.title = 'Chart Title'
w_label.description = 'a long chart description'
w_label.set_name = 'Set Label'
w_label.combination_name = 'Combination Label'
w_label

UpSetJSWidget(value=None, combination_name='Combination Label', combinations=[UpSetSetIntersection(name=one, s…

### Log Scale

Setting `.numeric_scale = 'log'` switches to a log scale. Similarly, `'linear'` switches back to a linear scale.

In [22]:
w_log = w.copy()
w_log.numeric_scale = 'log'
w_log

UpSetJSWidget(value=None, combinations=[UpSetSetIntersection(name=one, sets={'one'}, elems={'m', 'l', 'g', 'c'…

### Size

The `.width` and `.height` properties specify the width and height of the chart, respectively. In general, the `.layout` of the Jupyter widget can be used to customize it.

In [23]:
w.height = 600
w

UpSetJSWidget(value=None, combinations=[UpSetSetIntersection(name=one, sets={'one'}, elems={'m', 'l', 'g', 'c'…