{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# An Introduction to Visualising In the Spotlight Data Using Python\n", "\n", "In this notebook we will introduce a way of producing visualisations of [*In the Spotlight*](https://www.libcrowds.com/collection/playbills) results data, using Python.\n", "\n", "[Plotly.py](https://plot.ly/d3-js-for-python-and-pandas-charts/) is a Python graphing library that can be used to produce over 30 chart types that can be viewed in Jupyter notebooks. We will use it here to produce pie and bar charts. In future notebooks we may go on to explore some more complex chart types.\n", "\n", "We begin by importing the required Python libraries, pandas and plotly." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "import pandas\n", "import plotly" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The dataset\n", "\n", "Our input will be the dataframe of performance data introduced in a [previous notebook](intro_to_analysing_its_data_using_python.ipynb). Again, all we need to know about the code block below is that it loads our dataframe of performance data." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/vnd.plotly.v1+html": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import os\n", "import sys\n", "# Add the shared scripts directory to the import path\n", "module_path = os.path.abspath(os.path.join('..', 'data', 'scripts'))\n", "if module_path not in sys.path:\n", "    sys.path.append(module_path)\n", "from get_its_performances import get_performances_df\n", "df = get_performances_df()\n", "\n", "# Sets plotly to offline mode\n", "plotly.offline.init_notebook_mode()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a reminder of how this dataframe looks, we can call the `head()` method." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
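<table><thead><tr><th></th><th>title</th><th>date</th><th>genre</th><th>link</th><th>theatre</th><th>city</th><th>source</th></tr></thead><tbody>
<tr><th>0</th><td>Pageantry</td><td>NaN</td><td>NaN</td><td>http://access.bl.uk/item/viewer/ark:/81055/vdc...</td><td>Theatre Royal, Margate</td><td>Margate</td><td>https://api.bl.uk/metadata/iiif/ark:/81055/vdc...</td></tr>
<tr><th>1</th><td>The Hypocrite</td><td>NaN</td><td>Comedy</td><td>http://access.bl.uk/item/viewer/ark:/81055/vdc...</td><td>Theatre Royal, Margate</td><td>Margate</td><td>https://api.bl.uk/metadata/iiif/ark:/81055/vdc...</td></tr>
<tr><th>2</th><td>The Padlock</td><td>NaN</td><td>Musical Farce</td><td>http://access.bl.uk/item/viewer/ark:/81055/vdc...</td><td>Theatre Royal, Margate</td><td>Margate</td><td>https://api.bl.uk/metadata/iiif/ark:/81055/vdc...</td></tr>
<tr><th>3</th><td>The Village Lawyer</td><td>NaN</td><td>Farce</td><td>http://access.bl.uk/item/viewer/ark:/81055/vdc...</td><td>Theatre Royal, Margate</td><td>Margate</td><td>https://api.bl.uk/metadata/iiif/ark:/81055/vdc...</td></tr>
<tr><th>4</th><td>Death of Gen. Wolfe</td><td>NaN</td><td>Ballet</td><td>http://access.bl.uk/item/viewer/ark:/81055/vdc...</td><td>Theatre Royal, Margate</td><td>Margate</td><td>https://api.bl.uk/metadata/iiif/ark:/81055/vdc...</td></tr>
</tbody></table>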
\n", "
" ], "text/plain": [ " title date genre \\\n", "0 Pageantry NaN NaN \n", "1 The Hypocrite NaN Comedy \n", "2 The Padlock NaN Musical Farce \n", "3 The Village Lawyer NaN Farce \n", "4 Death of Gen. Wolfe NaN Ballet \n", "\n", " link theatre \\\n", "0 http://access.bl.uk/item/viewer/ark:/81055/vdc... Theatre Royal, Margate \n", "1 http://access.bl.uk/item/viewer/ark:/81055/vdc... Theatre Royal, Margate \n", "2 http://access.bl.uk/item/viewer/ark:/81055/vdc... Theatre Royal, Margate \n", "3 http://access.bl.uk/item/viewer/ark:/81055/vdc... Theatre Royal, Margate \n", "4 http://access.bl.uk/item/viewer/ark:/81055/vdc... Theatre Royal, Margate \n", "\n", " city source \n", "0 Margate https://api.bl.uk/metadata/iiif/ark:/81055/vdc... \n", "1 Margate https://api.bl.uk/metadata/iiif/ark:/81055/vdc... \n", "2 Margate https://api.bl.uk/metadata/iiif/ark:/81055/vdc... \n", "3 Margate https://api.bl.uk/metadata/iiif/ark:/81055/vdc... \n", "4 Margate https://api.bl.uk/metadata/iiif/ark:/81055/vdc... " ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pie charts\n", "\n", "Pie charts are perhaps one of the most straightforward types of visualisation to get started with, all we need are a list of unique labels against the a list of counts for those labels. We can get these by using the `value_counts()` method, which was introduced in an [earlier notebook](intro_to_analysing_its_data_using_python.ipynb).\n", "\n", "The `entity` variable defined below identifies the column that we are counting." 
] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "entity = 'genre'\n", "series = df[entity].value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The output of the `value_counts()` method is a pandas [Series](https://pandas.pydata.org/pandas-docs/stable/dsintro.html#series), which is a one-dimensional labeled array capable of holding any data type. As with a dataframe, we can also call the `head()` method on a series to display a quick snapshot of the data." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Farce 221\n", "Comedy 218\n", "Tragedy 81\n", "Drama 77\n", "Play 66\n", "Name: genre, dtype: int64" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "series.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now define the labels and values to be used for our chart. Below, we define a limit of 10, take that number of rows from the top of the series and use them as our *labels* and *values*. Because `value_counts()` sorts its counts in descending order by default, these are the ten most frequent values." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "limit = 10\n", "labels = series[:limit].index.tolist()\n", "values = series[:limit].tolist()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Plotly chart can then be generated and displayed using the code below." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "data": [ { "labels": [ "Farce", "Comedy", "Tragedy", "Drama", "Play", "Musical Farce", "Interlude", "Melo-Drame", "Melo-Drama", "Melodrama" ], "type": "pie", "uid": "7c0db34c-8951-11e8-979e-3c07545057c8", "values": [ 221, 218, 81, 77, 66, 48, 39, 32, 28, 23 ] } ], "layout": { "autosize": true } }, "text/html": [ "
" ], "text/vnd.plotly.v1+html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "trace = plotly.date_part = plotly.graph_objs.Pie(labels=labels, values=values)\n", "fig = plotly.graph_objs.Figure(data=[trace])\n", "plotly.offline.iplot(fig)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are many options available in the plotly library for styling these charts, such as hiding the legend or displaying additional information when hovering over particular areas. These options are probably a little too much to get into here but more details can be found in the [plotly documentation](https://plot.ly/).\n", "\n", "To see the chart for a different column, such as title, you can try modifying the `entity` variable above. Note that the percentages shown are of the slice of data defined by our specified `limit`, rather than of the whole dataset." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bar charts\n", "\n", "Bar charts can be produced with very similar code to the pie chart generated above. Again, we just need a list of labels and a list of values. In fact, we will produce our first chart using the labels and values already defined above." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "data": [ { "type": "bar", "uid": "7c24e85c-8951-11e8-8b35-3c07545057c8", "x": [ "Farce", "Comedy", "Tragedy", "Drama", "Play", "Musical Farce", "Interlude", "Melo-Drame", "Melo-Drama", "Melodrama" ], "y": [ 221, 218, 81, 77, 66, 48, 39, 32, 28, 23 ] } ], "layout": { "autosize": true, "xaxis": { "autorange": true, "range": [ -0.5, 9.5 ], "type": "category" }, "yaxis": { "autorange": true, "range": [ 0, 232.6315789473684 ], "type": "linear" } } }, "text/html": [ "
" ], "text/vnd.plotly.v1+html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "trace = plotly.graph_objs.Bar(x=labels, y=values)\n", "fig = plotly.graph_objs.Figure(data=[trace])\n", "plotly.offline.iplot(fig)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, you can generate a chart for a different column by assigning a different value to the `entity` variable above." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "In this notebook we began visualising our perfomance data using Python." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }