# ipyannotate

Widget for Jupyter Notebook to check and annotate data. A simplified version of <a href="https://prodi.gy/demo?view_id=ner">Prodigy</a> for Jupyter Notebook.
<img src="i/screencast.gif"/>

Examples of markup tasks that can be solved using `ipyannotate`:
- Assigning one or more topics to a group of news articles
- Filtering out toxic reviews from your dataset
- Marking adult content from a set of screenshots
- Counting the number of errors in NER markup, grouped by type

Examples of tasks for which `ipyannotate` is not suitable:
- Drawing masks or selecting background vs foreground
- Given a set of portraits, detecting eyes/ears/nose/etc
- Other tasks that involve graphically annotating an image
- Marking up spans with names, locations, organizations in a text

`ipyannotate` is well suited to assigning one or several labels to each object in a set (pictures, texts, anything that Jupyter can display).

A number of tools do the same thing, for example <a href="https://prodi.gy/">Prodigy</a> and <a href="http://toloka.yandex.ru/">Yandex.Toloka</a>. `ipyannotate` is a Jupyter widget, and there are three reasons why that is convenient:

1. Input, output

Often the whole process of working with data takes place inside Jupyter Notebook. It is inconvenient to export the data to a JSON or XML file, send it to another service, then load and parse the results. With `ipyannotate` you can mark up the data directly in Jupyter:

In [2]:
from ipyannotate import annotate
from ipyannotate.buttons import (
    ValueButton as Button,
    NextButton as Next,
    BackButton as Back
)

data = [1, 15, 62, 33, 83, 12, 949, 71]
annotation = annotate(data, buttons=[Button('even'), Button('odd'), Back(), Next()])
annotation

<div align="right">Animation recorded with <a href="http://recordit.co">RecordIt</a></div>
<img src="i/odd_even.gif"/>

In [3]:
annotation.tasks

[Task(output=1, value=odd),
 Task(output=15, value=odd),
 Task(output=62, value=even),
 Task(output=33, value=odd),
 Task(output=83, value=odd),
 Task(output=12, value=even),
 Task(output=949, value=odd),
 Task(output=71, value=odd)]
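Because the results stay in the notebook, they can be summarized right away. The sketch below is only an illustration: it assumes each task exposes `output` and `value` attributes, as the printed representation suggests, and that an unlabeled task has `value` set to `None`. The `Task` namedtuple is a hypothetical stand-in so the snippet runs without the widget.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for the task objects printed above. The real
# objects are assumed (not confirmed) to expose `output` and `value`.
Task = namedtuple('Task', ['output', 'value'])

tasks = [
    Task(1, 'odd'), Task(15, 'odd'), Task(62, 'even'), Task(33, 'odd'),
    Task(83, 'odd'), Task(12, 'even'), Task(949, 'odd'), Task(71, 'odd'),
]


def count_labels(tasks):
    """Count how many tasks received each label, skipping unlabeled ones."""
    return Counter(task.value for task in tasks if task.value is not None)


def group_by_label(tasks):
    """Map each label to the list of items that received it."""
    groups = {}
    for task in tasks:
        if task.value is not None:
            groups.setdefault(task.value, []).append(task.output)
    return groups


counts = count_labels(tasks)
groups = group_by_label(tasks)
print(counts)
print(groups)
```

In a notebook, the same helpers would be called on `annotation.tasks` instead of the stand-in list, provided the assumptions above hold.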

2. Data Presentation

Jupyter has a convenient data presentation system. If the result of a command is a graph, the user sees the picture, not `<Figure # 881dabb ...>` as in a regular console. The same functionality is available in `ipyannotate`. For example, you can conveniently label images:

In [4]:
from glob import glob
from PIL import Image

data = [Image.open(_) for _ in glob('i/dogs_cats/*.jpg')]
annotation = annotate(data, buttons=[Button('dog'), Button('cat'), Back(), Next()])
annotation

Annotation(canvas=OutputCanvas(), progress=Progress(atoms=[<ipyannotate.progress.Atom object at 0x10b37f240>, …

<img src="i/dogs_cats.gif">

3. Interactivity

When labeling texts by category, the full set of topics is often not known in advance, so the list of buttons passed to `ipyannotate.annotate` cannot be fixed before work starts. Widgets are interactive, so you can add a button on the go. Intermediate markup results can also be stored at any point, because `annotation.tasks` is updated in memory as you label.

In [4]:
data = [
    'tours to turkey',
    'lake garda september',
    'hotels at black sea',
    'Smartline Konaktepe Hotel 4*',
    'amsterdam where to go',
    'hotel dream in a forest',
    'kabardinka where to go',
    'dubovichi restaurant',
    'dali museum'
]

buttons = [
    Button('place'),
    Button('hotel')
]
controls = [
    Back(),
    Next()
]
annotation = annotate(data, buttons=buttons + controls)
annotation

In [5]:
fun = Button('fun')
annotation.toolbar.buttons = buttons + [fun] + controls

<img src="i/interactive.gif"/>
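Since the task list is updated in memory, intermediate results can be written to disk between labeling sessions. The following is a minimal sketch, not part of `ipyannotate`: `dump_tasks` and `load_records` are hypothetical helpers, the `Task` namedtuple is a stand-in for the real task objects, and unlabeled tasks are assumed to have `value` set to `None`. It only works for JSON-serializable items such as texts or numbers.

```python
import json
import os
import tempfile
from collections import namedtuple

# Hypothetical stand-in; the real task objects are assumed to expose
# `output` and `value` attributes.
Task = namedtuple('Task', ['output', 'value'])


def dump_tasks(tasks, path):
    """Write already labeled tasks to a JSON file, return how many were saved."""
    records = [
        {'output': task.output, 'value': task.value}
        for task in tasks
        if task.value is not None  # assumption: None means "not labeled yet"
    ]
    with open(path, 'w', encoding='utf-8') as file:
        json.dump(records, file, ensure_ascii=False, indent=2)
    return len(records)


def load_records(path):
    """Read the saved records back as a list of dicts."""
    with open(path, encoding='utf-8') as file:
        return json.load(file)


tasks = [
    Task('tours to turkey', 'place'),
    Task('lake garda september', 'place'),
    Task('Smartline Konaktepe Hotel 4*', 'hotel'),
    Task('dali museum', None),  # not labeled yet
]

path = os.path.join(tempfile.mkdtemp(), 'annotation.json')
saved = dump_tasks(tasks, path)
records = load_records(path)
print(saved, records)
```

In a notebook, `dump_tasks(annotation.tasks, 'annotation.json')` could be run from a separate cell at any time while labeling continues, under the same assumptions.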

## Overview

The main user interface is the `annotate` function. It has one required argument, `tasks` (a list of tasks), and three optional ones: `buttons`, `display` and `multi`.

In [7]:
data = range(10)

annotate(data, buttons=None, display=None, multi=False)

<img src="i/default.gif"/>

By default, `buttons` is a list of four: "ok", "err", "back", "next". This set is suitable for simple verification of the output of any classifier. You can specify your own buttons and customize their colors and shortcuts:

In [8]:
from ipyannotate.buttons import (
    ValueButton as Button,
    BackButton as Back,
    NextButton as Next
)

buttons = [
    Button(2, label='divisible by 2', color='blue', icon='½ ', shortcut='1'),
    Button(3, label='by 3', color='red', icon='⅓ ', shortcut='2'),
    Button(5, label='by 5', color='green', icon='⅕ ', shortcut='3'),
]
controls = [
    Back(),
    Next()
]
annotate(data, buttons=buttons + controls)

When `multi=False`, the widget is in radio button mode and only one label can be assigned to each object. When `multi=True`, several labels can be assigned:

In [9]:
buttons = [
    Button(2, label='divisible by 2', color='blue', icon='½ ', shortcut='1'),
    Button(3, label='by 3', color='red', icon='⅓ ', shortcut='2'),
    Button(5, label='by 5', color='green', icon='⅕ ', shortcut='3'),
]
controls = [
    Back(),
    Next()
]
annotation = annotate(data, buttons=buttons + controls, multi=True)
annotation

<img src="i/multi.gif"/>

If `multi=True`, the `value` of each task in `annotation.tasks` is a `set`:

In [10]:
annotation.tasks

[MultiTask(output=0, value={2, 3, 5}),
 MultiTask(output=1, value=set()),
 MultiTask(output=2, value={2}),
 MultiTask(output=3, value={3}),
 MultiTask(output=4, value={2}),
 MultiTask(output=5, value={5}),
 MultiTask(output=6, value={2, 3}),
 MultiTask(output=7, value=set()),
 MultiTask(output=8, value={2}),
 MultiTask(output=9, value={3})]
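Set-valued results need slightly different post-processing than single labels. As a rough sketch, assuming each task exposes `output` and `value` as shown in the output above, per-label counts and the list of items left without labels could be computed as follows. The `MultiTask` namedtuple is a hypothetical stand-in that mirrors the printed results.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in mirroring the printed MultiTask objects.
MultiTask = namedtuple('MultiTask', ['output', 'value'])

tasks = [
    MultiTask(0, {2, 3, 5}), MultiTask(1, set()), MultiTask(2, {2}),
    MultiTask(3, {3}), MultiTask(4, {2}), MultiTask(5, {5}),
    MultiTask(6, {2, 3}), MultiTask(7, set()), MultiTask(8, {2}),
    MultiTask(9, {3}),
]


def count_multi_labels(tasks):
    """Count how many tasks carry each label; a task may carry several."""
    counts = Counter()
    for task in tasks:
        counts.update(task.value)
    return counts


def without_labels(tasks):
    """Return the items whose label set is empty."""
    return [task.output for task in tasks if not task.value]


label_counts = count_multi_labels(tasks)
unlabeled = without_labels(tasks)
print(label_counts)
print(unlabeled)
```

An empty set is ambiguous here: it can mean either "no label applies" or "not reviewed yet", so it is worth deciding which reading applies before filtering on it.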

By default, the standard `IPython.display` is used for output. You can supply your own function instead, for example to show the path to the image file along with the image:

In [12]:
data = {_: Image.open(_) for _ in glob('i/dogs_cats/*.jpg')}


def display_item(item):
    # Each item is a (path, image) pair produced by data.items()
    from IPython.display import display

    path, image = item
    print(path)
    display(image)


buttons = [
    Button('dog', shortcut='1'),
    Button('cat', shortcut='2'),
    Back(),
    Next()
]
annotation = annotate(data.items(), buttons=buttons, display=display_item)
annotation

Annotation(canvas=OutputCanvas(), progress=Progress(atoms=[<ipyannotate.progress.Atom object at 0x10b48d518>, …

<img src="i/display.gif">
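The same approach works for structured items. The sketch below assumes the items are dicts with hypothetical `source` and `text` keys and relies only on `print`, as `display_item` above does for the path. Splitting formatting from printing makes the function easy to check outside the widget.

```python
import contextlib
import io
import textwrap


def format_record(record, width=60):
    """Build the text shown for one record: a source header plus wrapped text."""
    header = '[{}]'.format(record['source'])
    body = textwrap.fill(record['text'], width=width)
    return header + '\n' + body


def display_record(record):
    """Display function: takes one item and prints it."""
    print(format_record(record))


# Hypothetical records; the keys are chosen for illustration only.
records = [
    {'source': 'reviews.csv', 'text': 'Great hotel, friendly staff.'},
    {'source': 'forum.txt', 'text': 'Do not go there, the food is awful.'},
]

# Outside the widget the function can be checked by capturing stdout.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    display_record(records[0])
printed = buffer.getvalue()
print(printed)
```

Following the pattern of the example above, it would then be passed as `annotate(records, display=display_record)`.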

See <a href="https://github.com/natasha/ipyannotate-examples">ipyannotate-examples</a> repo for more examples.