{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Visualising In the Spotlight Data Over Time\n", "\n", "In this notebook we will produce some visualisations of [*In the Spotlight*](https://www.libcrowds.com/collection/playbills) performance data over time to see if we can begin to identify any trends.\n", "\n", "As we begin to get into more complicated territory, we won't explain every function used in detail. However, hopefully there will be something here that most can follow.\n", "\n", "We will again use pandas and plotly as our core Python libraries, both of which were introduced in previous notebooks." ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [], "source": [ "import pandas\n", "import plotly" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The dataset\n", "\n", "Our input will again be the dataframe of performance data introduced in a [previous notebook](intro_to_analysing_its_data_using_python.ipynb). The dataframe is loaded in the code block below." ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/vnd.plotly.v1+html": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import os\n", "import sys\n", "module_path = os.path.abspath(os.path.join('..', 'data', 'scripts'))\n", "if module_path not in sys.path:\n", " sys.path.append(module_path)\n", "from get_its_performances import get_performances_df\n", "df = get_performances_df()\n", "\n", "# Sets plotly to offline mode\n", "plotly.offline.init_notebook_mode()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a reminder of how this dataframe looks we can run the `head()` function." ] }, { "cell_type": "code", "execution_count": 94, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| | title | date | genre | link | theatre | city | source |
|---|---|---|---|---|---|---|---|
| 0 | Pageantry | NaN | NaN | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Theatre Royal, Margate | Margate | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... |
| 1 | The Hypocrite | NaN | Comedy | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Theatre Royal, Margate | Margate | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... |
| 2 | The Padlock | NaN | Musical Farce | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Theatre Royal, Margate | Margate | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... |
| 3 | The Village Lawyer | NaN | Farce | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Theatre Royal, Margate | Margate | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... |
| 4 | Death of Gen. Wolfe | NaN | Ballet | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Theatre Royal, Margate | Margate | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... |
\n", "
" ], "text/plain": [ " title date genre \\\n", "0 Pageantry NaN NaN \n", "1 The Hypocrite NaN Comedy \n", "2 The Padlock NaN Musical Farce \n", "3 The Village Lawyer NaN Farce \n", "4 Death of Gen. Wolfe NaN Ballet \n", "\n", " link theatre \\\n", "0 http://access.bl.uk/item/viewer/ark:/81055/vdc... Theatre Royal, Margate \n", "1 http://access.bl.uk/item/viewer/ark:/81055/vdc... Theatre Royal, Margate \n", "2 http://access.bl.uk/item/viewer/ark:/81055/vdc... Theatre Royal, Margate \n", "3 http://access.bl.uk/item/viewer/ark:/81055/vdc... Theatre Royal, Margate \n", "4 http://access.bl.uk/item/viewer/ark:/81055/vdc... Theatre Royal, Margate \n", "\n", " city source \n", "0 Margate https://api.bl.uk/metadata/iiif/ark:/81055/vdc... \n", "1 Margate https://api.bl.uk/metadata/iiif/ark:/81055/vdc... \n", "2 Margate https://api.bl.uk/metadata/iiif/ark:/81055/vdc... \n", "3 Margate https://api.bl.uk/metadata/iiif/ark:/81055/vdc... \n", "4 Margate https://api.bl.uk/metadata/iiif/ark:/81055/vdc... " ] }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adding days, months and years to the dataframe\n", "\n", "As we begin looking at our date information more closely it might be useful to add separate columns for day, month and year to our dataframe so that we can plot other entities against these values.\n", "\n", "We will also want to remove any rows that do not contian a date, or contain an incomplete date, as is the case for many of the playbills. The following line of code checks each value in the date column against a regular expression and removes those rows that do not match the pattern that identifies a complete date." 
] }, { "cell_type": "code", "execution_count": 95, "metadata": {}, "outputs": [], "source": [ "# Keep only rows whose date contains a full YYYY-MM-DD pattern; copy so later column assignments do not act on a view\n", "df = df[df.date.str.contains(r'\\d{4}-\\d{2}-\\d{2}', na=False)].copy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The date column is then converted to a datetime type." ] }, { "cell_type": "code", "execution_count": 96, "metadata": {}, "outputs": [], "source": [ "df['date'] = pandas.to_datetime(df['date'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We are now ready to create our additional columns." ] }, { "cell_type": "code", "execution_count": 97, "metadata": {}, "outputs": [], "source": [ "# The .dt accessor exposes each date component as an integer\n", "df['day'] = df['date'].dt.day\n", "df['month'] = df['date'].dt.month\n", "df['year'] = df['date'].dt.year" ] }, { "cell_type": "code", "execution_count": 98, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| | title | date | genre | link | theatre | city | source | day | month | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 194 | Wandering Boys: Or, the Castle of Olival | 1829-04-30 | NaN | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... | 30 | 4 | 1829 |
| 198 | High Life Below Stairs | 1828-04-10 | Farce | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... | 10 | 4 | 1828 |
| 202 | Jack Robinson and His Monkey | 1829-01-30 | NaN | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... | 30 | 1 | 1829 |
| 205 | Invincibles; Ou Les Femmes Soldats | 1829-03-05 | NaN | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... | 5 | 3 | 1829 |
| 208 | Devil to Pay | 1830-11-23 | Farce | http://access.bl.uk/item/viewer/ark:/81055/vdc... | Miscellaneous Plymouth theatres | Plymouth | https://api.bl.uk/metadata/iiif/ark:/81055/vdc... | 23 | 11 | 1830 |
\n", "
" ], "text/plain": [ " title date genre \\\n", "194 Wandering Boys: Or, the Castle of Olival 1829-04-30 NaN \n", "198 High Life Below Stairs 1828-04-10 Farce \n", "202 Jack Robinson and His Monkey 1829-01-30 NaN \n", "205 Invincibles; Ou Les Femmes Soldats 1829-03-05 NaN \n", "208 Devil to Pay 1830-11-23 Farce \n", "\n", " link \\\n", "194 http://access.bl.uk/item/viewer/ark:/81055/vdc... \n", "198 http://access.bl.uk/item/viewer/ark:/81055/vdc... \n", "202 http://access.bl.uk/item/viewer/ark:/81055/vdc... \n", "205 http://access.bl.uk/item/viewer/ark:/81055/vdc... \n", "208 http://access.bl.uk/item/viewer/ark:/81055/vdc... \n", "\n", " theatre city \\\n", "194 Miscellaneous Plymouth theatres Plymouth \n", "198 Miscellaneous Plymouth theatres Plymouth \n", "202 Miscellaneous Plymouth theatres Plymouth \n", "205 Miscellaneous Plymouth theatres Plymouth \n", "208 Miscellaneous Plymouth theatres Plymouth \n", "\n", " source day month year \n", "194 https://api.bl.uk/metadata/iiif/ark:/81055/vdc... 30 4 1829 \n", "198 https://api.bl.uk/metadata/iiif/ark:/81055/vdc... 10 4 1828 \n", "202 https://api.bl.uk/metadata/iiif/ark:/81055/vdc... 30 1 1829 \n", "205 https://api.bl.uk/metadata/iiif/ark:/81055/vdc... 5 3 1829 \n", "208 https://api.bl.uk/metadata/iiif/ark:/81055/vdc... 23 11 1830 " ] }, "execution_count": 98, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting most popular periods\n", "\n", "We can now identify the days, months or years where most plays were performed. The following code block plots a chart of plays performed by month of the year." 
] }, { "cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [ { "data": { "application/vnd.plotly.v1+json": { "data": [ { "type": "scatter", "uid": "8476c412-8951-11e8-b233-3c07545057c8", "x": [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 ], "y": [ 115, 103, 80, 83, 32, 5, 27, 44, 20, 73, 84, 103 ] } ], "layout": { "autosize": true, "xaxis": { "autorange": true, "range": [ 0.30890052356020936, 12.69109947643979 ], "type": "linear" }, "yaxis": { "autorange": true, "range": [ -3.0573248407643305, 123.05732484076432 ], "type": "linear" } } }, "text/html": [ "
" ], "text/vnd.plotly.v1+html": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "date_part = 'month'\n", "series = df[date_part].value_counts()\n", "series.sort_index(inplace=True)\n", "trace = plotly.graph_objs.Scatter(x=series.index, y=series)\n", "fig = plotly.graph_objs.Figure(data=[trace])\n", "plotly.offline.iplot(fig)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that there appears to be a trend towards less performances during the middle of the year. Although, with a relatively small dateset we might want to be careful about attempting to draw any conclusions just yet (trends will become clearer as more data is collected).\n", "\n", "Similar charts for the day or year can be produced by modifying the `date_part` variable above." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "Work in progress!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }