{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Gathering historical data about the addition of newspaper titles to Trove\n",
    "\n",
    "The number of digitised newspapers available through Trove has increased dramatically since 2009. Understanding when newspapers were added is important for historiographical purposes, but there's no data about this available directly from Trove. This notebook uses web archives to extract lists of newspapers in Trove over time, and chart Trove's development.\n",
    "\n",
    "Trove has always provided a browseable list of digitised newspaper titles. The url and format of this list has changed over time, but it's possible to find captures of this page in the Internet Archive and extract the full list of titles. The pages are also captured in the Australian Web Archive, but the Wayback Machine has a more detailed record.\n",
    "\n",
    "The pages that I'm looking for are:\n",
    "\n",
    "* [http://trove.nla.gov.au/ndp/del/titles](https://web.archive.org/web/*/http://trove.nla.gov.au/ndp/del/titles)\n",
    "* [https://trove.nla.gov.au/newspaper/about](https://web.archive.org/web/*/https://trove.nla.gov.au/newspaper/about)\n",
    "\n",
    "This notebook creates the following data files:\n",
    "\n",
    "* [trove_newspaper_titles_2009_2021.csv](https://github.com/GLAM-Workbench/trove-newspapers/blob/master/trove_newspaper_titles_2009_2021.csv) – complete dataset of captures and titles\n",
    "* [trove_newspaper_titles_first_appearance_2009_2021.csv](https://github.com/GLAM-Workbench/trove-newspapers/blob/master/trove_newspaper_titles_first_appearance_2009_2021.csv) – filtered dataset, showing only the first appearance of each title / place / date range combination\n",
    "\n",
    "I've also created a [browseable list of titles](https://gist.github.com/wragge/7d80507c3e7957e271c572b8f664031a), showing when they first appeared in Trove."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "import json\n",
    "import re\n",
    "from surt import surt\n",
    "from bs4 import BeautifulSoup\n",
    "import arrow\n",
    "import pandas as pd\n",
    "import altair as alt\n",
    "from IPython.display import display, HTML\n",
    "from pathlib import Path"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Code for harvesting web archive captures\n",
    "\n",
    "We're using the Memento protocol to get a list of captures. See the [Web Archives section](https://glam-workbench.net/web-archives/) of the GLAM Workbench for more details."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# The code in this cell is copied from notebooks in the Web Archives section of the GLAM Workbench (https://glam-workbench.net/web-archives/)\n",
    "# In particular see: https://glam-workbench.net/web-archives/#find-all-the-archived-versions-of-a-web-page\n",
    "\n",
    "# These are the repositories we'll be using\n",
    "TIMEGATES = {\n",
    "    'awa': 'https://web.archive.org.au/awa/',\n",
    "    'nzwa': 'https://ndhadeliver.natlib.govt.nz/webarchive/wayback/',\n",
    "    'ukwa': 'https://www.webarchive.org.uk/wayback/en/archive/',\n",
    "    'ia': 'https://web.archive.org/web/'\n",
    "}\n",
    "\n",
    "def convert_lists_to_dicts(results):\n",
    "    '''\n",
    "    Converts IA style timemap (a JSON array of arrays) to a list of dictionaries.\n",
    "    Renames keys to standardise IA with other Timemaps.\n",
    "    '''\n",
    "    if results:\n",
    "        keys = results[0]\n",
    "        results_as_dicts = [dict(zip(keys, v)) for v in results[1:]]\n",
    "    else:\n",
    "        results_as_dicts = results\n",
    "    for d in results_as_dicts:\n",
    "        d['status'] = d.pop('statuscode')\n",
    "        d['mime'] = d.pop('mimetype')\n",
    "        d['url'] = d.pop('original')\n",
    "    return results_as_dicts\n",
    "\n",
    "def get_capture_data_from_memento(url, request_type='head'):\n",
    "    '''\n",
    "    For OpenWayback systems this can get some extra capture info to insert into Timemaps.\n",
    "    '''\n",
    "    if request_type == 'head':\n",
    "        response = requests.head(url)\n",
    "    else:\n",
    "        response = requests.get(url)\n",
    "    headers = response.headers\n",
    "    length = headers.get('x-archive-orig-content-length')\n",
    "    status = headers.get('x-archive-orig-status')\n",
    "    status = status.split(' ')[0] if status else None\n",
    "    mime = headers.get('x-archive-orig-content-type')\n",
    "    mime = mime.split(';')[0] if mime else None\n",
    "    return {'length': length, 'status': status, 'mime': mime}\n",
    "\n",
    "def convert_link_to_json(results, enrich_data=False):\n",
    "    '''\n",
    "    Converts link formatted Timemap to JSON.\n",
    "    '''\n",
    "    data = []\n",
    "    for line in results.splitlines():\n",
    "        parts = line.split('; ')\n",
    "        if len(parts) > 1:\n",
    "            link_type = re.search(r'rel=\"(original|self|timegate|first memento|last memento|memento)\"', parts[1]).group(1)\n",
    "            if link_type == 'memento':\n",
    "                link = parts[0].strip('<>')\n",
    "                timestamp, original = re.search(r'/(\\d{14})/(.*)$', link).groups()\n",
    "                capture = {'urlkey': surt(original), 'timestamp': timestamp, 'url': original}\n",
    "                if enrich_data:\n",
    "                    capture.update(get_capture_data_from_memento(link))\n",
    "                    print(capture)\n",
    "                data.append(capture)\n",
    "    return data\n",
    "                \n",
    "def get_timemap_as_json(timegate, url, enrich_data=False):\n",
    "    '''\n",
    "    Get a Timemap then normalise results (if necessary) to return a list of dicts.\n",
    "    '''\n",
    "    tg_url = f'{TIMEGATES[timegate]}timemap/json/{url}/'\n",
    "    response = requests.get(tg_url)\n",
    "    response_type = response.headers['content-type']\n",
    "    if response_type == 'text/x-ndjson':\n",
    "        data = [json.loads(line) for line in response.text.splitlines()]\n",
    "    elif response_type == 'application/json':\n",
    "        data = convert_lists_to_dicts(response.json())\n",
    "    elif response_type in ['application/link-format', 'text/html;charset=utf-8']:\n",
    "        data = convert_link_to_json(response.text, enrich_data=enrich_data)\n",
    "    return data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Harvest the title data from the Internet Archive\n",
    "\n",
    "This gets the web page captures from the Internet Archive, scrapes the list of titles from the page, then does a bit of normalisation of the title data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "titles = []\n",
    "\n",
    "# These are the pages that listed available titles.\n",
    "# There was a change in 2016\n",
    "pages = [{'url': 'http://trove.nla.gov.au/ndp/del/titles', 'path': '/ndp/del/title/'},\n",
    "         {'url': 'https://trove.nla.gov.au/newspaper/about', 'path': '/newspaper/title/'}]\n",
    "\n",
    "for page in pages:\n",
    "    for capture in get_timemap_as_json('ia', page['url']):\n",
    "        if capture['status'] == '200':\n",
    "            url = f'https://web.archive.org/web/{capture[\"timestamp\"]}id_/{capture[\"url\"]}'\n",
    "            #print(url)\n",
    "            capture_date = arrow.get(capture['timestamp'][:8], 'YYYYMMDD').format('YYYY-MM-DD')\n",
    "            #print(capture_date)\n",
    "            response = requests.get(url)\n",
    "            soup = BeautifulSoup(response.content)\n",
    "            title_links = soup.find_all('a', href=re.compile(page['path']))\n",
    "            for title in title_links:\n",
    "                # Get the title text\n",
    "                full_title = title.get_text().strip()\n",
    "                \n",
    "                # Get the title id\n",
    "                title_id = re.search(r'\\/(\\d+)\\/?$', title['href']).group(1)\n",
    "            \n",
    "                # Most of the code below is aimed at normalising the publication place and dates values to allow for easy grouping & deduplication\n",
    "                brief_title = re.sub(r'\\(.+\\)\\s*$', '', full_title).strip()\n",
    "                try:\n",
    "                    details = re.search(r'\\((.+)\\)\\s*$', full_title).group(1).split(':')\n",
    "                except AttributeError:\n",
    "                    place = ''\n",
    "                    dates = ''\n",
    "                else:\n",
    "                    try:\n",
    "                        place = details[0].strip()\n",
    "                        # Normalise states\n",
    "                        try:\n",
    "                            place = re.sub(r'(, )?([A-Za-z]+)[\\.\\s]*$', lambda match: f'{match.group(1) if match.group(1) else \"\"}{match.group(2).upper()}', place)\n",
    "                        except AttributeError:\n",
    "                            pass\n",
    "                        # Normalise dates\n",
    "                        dates = ' - '.join([d.strip() for d in details[1].strip().split('-')])\n",
    "                    except IndexError:\n",
    "                        place = ''\n",
    "                        dates = ' - '.join([d.strip() for d in details[0].strip().split('-')])\n",
    "                titles.append({'title_id': title_id, 'full_title': full_title, 'title': brief_title, 'place': place, 'dates': dates, 'capture_date': capture_date, 'capture_timestamp': capture['timestamp']})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Convert the title data to a DataFrame for analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.DataFrame(titles)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>title_id</th>\n",
       "      <th>full_title</th>\n",
       "      <th>title</th>\n",
       "      <th>place</th>\n",
       "      <th>dates</th>\n",
       "      <th>capture_date</th>\n",
       "      <th>capture_timestamp</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>34</td>\n",
       "      <td>Advertiser (Adelaide, SA : 1889-1931)</td>\n",
       "      <td>Advertiser</td>\n",
       "      <td>Adelaide, SA</td>\n",
       "      <td>1889 - 1931</td>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>20091112000713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>13</td>\n",
       "      <td>Argus (Melbourne, Vic. : 1848-1954)</td>\n",
       "      <td>Argus</td>\n",
       "      <td>Melbourne, VIC</td>\n",
       "      <td>1848 - 1954</td>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>20091112000713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>16</td>\n",
       "      <td>Brisbane Courier (Qld. : 1864-1933)</td>\n",
       "      <td>Brisbane Courier</td>\n",
       "      <td>QLD</td>\n",
       "      <td>1864 - 1933</td>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>20091112000713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>11</td>\n",
       "      <td>Canberra Times (ACT : 1926-1954)</td>\n",
       "      <td>Canberra Times</td>\n",
       "      <td>ACT</td>\n",
       "      <td>1926 - 1954</td>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>20091112000713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>24</td>\n",
       "      <td>Colonial Times (Hobart, Tas. : 1828-1857)</td>\n",
       "      <td>Colonial Times</td>\n",
       "      <td>Hobart, TAS</td>\n",
       "      <td>1828 - 1857</td>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>20091112000713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>90111</th>\n",
       "      <td>1374</td>\n",
       "      <td>Papuan Times (Port Moresby, Papua New Guinea :...</td>\n",
       "      <td>Papuan Times</td>\n",
       "      <td>Port Moresby, Papua New GUINEA</td>\n",
       "      <td>1911 - 1916</td>\n",
       "      <td>2021-04-15</td>\n",
       "      <td>20210415021550</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>90112</th>\n",
       "      <td>1369</td>\n",
       "      <td>Territory of Papua Government Gazette (Papua N...</td>\n",
       "      <td>Territory of Papua Government Gazette</td>\n",
       "      <td>Papua New GUINEA</td>\n",
       "      <td>1906 - 1942</td>\n",
       "      <td>2021-04-15</td>\n",
       "      <td>20210415021550</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>90113</th>\n",
       "      <td>1371</td>\n",
       "      <td>Territory of Papua and New Guinea Government G...</td>\n",
       "      <td>Territory of Papua and New Guinea Government G...</td>\n",
       "      <td></td>\n",
       "      <td>1949 - 1971</td>\n",
       "      <td>2021-04-15</td>\n",
       "      <td>20210415021550</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>90114</th>\n",
       "      <td>1370</td>\n",
       "      <td>Territory of Papua-New Guinea Government Gazet...</td>\n",
       "      <td>Territory of Papua-New Guinea Government Gazette</td>\n",
       "      <td></td>\n",
       "      <td>1945 - 1949</td>\n",
       "      <td>2021-04-15</td>\n",
       "      <td>20210415021550</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>90115</th>\n",
       "      <td>1391</td>\n",
       "      <td>Tribune (Philippines : 1932 - 1945)</td>\n",
       "      <td>Tribune</td>\n",
       "      <td>PHILIPPINES</td>\n",
       "      <td>1932 - 1945</td>\n",
       "      <td>2021-04-15</td>\n",
       "      <td>20210415021550</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>90116 rows × 7 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      title_id                                         full_title  \\\n",
       "0           34              Advertiser (Adelaide, SA : 1889-1931)   \n",
       "1           13                Argus (Melbourne, Vic. : 1848-1954)   \n",
       "2           16                Brisbane Courier (Qld. : 1864-1933)   \n",
       "3           11                   Canberra Times (ACT : 1926-1954)   \n",
       "4           24          Colonial Times (Hobart, Tas. : 1828-1857)   \n",
       "...        ...                                                ...   \n",
       "90111     1374  Papuan Times (Port Moresby, Papua New Guinea :...   \n",
       "90112     1369  Territory of Papua Government Gazette (Papua N...   \n",
       "90113     1371  Territory of Papua and New Guinea Government G...   \n",
       "90114     1370  Territory of Papua-New Guinea Government Gazet...   \n",
       "90115     1391                Tribune (Philippines : 1932 - 1945)   \n",
       "\n",
       "                                                   title  \\\n",
       "0                                             Advertiser   \n",
       "1                                                  Argus   \n",
       "2                                       Brisbane Courier   \n",
       "3                                         Canberra Times   \n",
       "4                                         Colonial Times   \n",
       "...                                                  ...   \n",
       "90111                                       Papuan Times   \n",
       "90112              Territory of Papua Government Gazette   \n",
       "90113  Territory of Papua and New Guinea Government G...   \n",
       "90114   Territory of Papua-New Guinea Government Gazette   \n",
       "90115                                            Tribune   \n",
       "\n",
       "                                place        dates capture_date  \\\n",
       "0                        Adelaide, SA  1889 - 1931   2009-11-12   \n",
       "1                      Melbourne, VIC  1848 - 1954   2009-11-12   \n",
       "2                                 QLD  1864 - 1933   2009-11-12   \n",
       "3                                 ACT  1926 - 1954   2009-11-12   \n",
       "4                         Hobart, TAS  1828 - 1857   2009-11-12   \n",
       "...                               ...          ...          ...   \n",
       "90111  Port Moresby, Papua New GUINEA  1911 - 1916   2021-04-15   \n",
       "90112                Papua New GUINEA  1906 - 1942   2021-04-15   \n",
       "90113                                  1949 - 1971   2021-04-15   \n",
       "90114                                  1945 - 1949   2021-04-15   \n",
       "90115                     PHILIPPINES  1932 - 1945   2021-04-15   \n",
       "\n",
       "      capture_timestamp  \n",
       "0        20091112000713  \n",
       "1        20091112000713  \n",
       "2        20091112000713  \n",
       "3        20091112000713  \n",
       "4        20091112000713  \n",
       "...                 ...  \n",
       "90111    20210415021550  \n",
       "90112    20210415021550  \n",
       "90113    20210415021550  \n",
       "90114    20210415021550  \n",
       "90115    20210415021550  \n",
       "\n",
       "[90116 rows x 7 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "120"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Number of captures\n",
    "len(df['capture_timestamp'].unique())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "111"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Number of days on which the pages were captured\n",
    "len(df['capture_date'].unique())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Save this dataset as a CSV file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.to_csv('trove_newspaper_titles_2009_2021.csv', index=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## How did the number of titles change over time?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>capture_date</th>\n",
       "      <th>total</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2021-04-15</td>\n",
       "      <td>1666</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2021-03-11</td>\n",
       "      <td>1658</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2021-02-05</td>\n",
       "      <td>1649</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2020-11-12</td>\n",
       "      <td>1625</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2020-05-10</td>\n",
       "      <td>1553</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>106</th>\n",
       "      <td>2010-04-28</td>\n",
       "      <td>37</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>107</th>\n",
       "      <td>2009-11-24</td>\n",
       "      <td>34</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>108</th>\n",
       "      <td>2009-12-12</td>\n",
       "      <td>34</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>109</th>\n",
       "      <td>2009-11-22</td>\n",
       "      <td>34</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>110</th>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>34</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>111 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    capture_date  total\n",
       "0     2021-04-15   1666\n",
       "1     2021-03-11   1658\n",
       "2     2021-02-05   1649\n",
       "3     2020-11-12   1625\n",
       "4     2020-05-10   1553\n",
       "..           ...    ...\n",
       "106   2010-04-28     37\n",
       "107   2009-11-24     34\n",
       "108   2009-12-12     34\n",
       "109   2009-11-22     34\n",
       "110   2009-11-12     34\n",
       "\n",
       "[111 rows x 2 columns]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Drop duplicates in cases where there were mutiple captures on a single day\n",
    "captures_df = df.drop_duplicates(subset=['capture_date', 'full_title'])\n",
    "\n",
    "# Calculate totals per capture\n",
    "capture_totals = captures_df['capture_date'].value_counts().to_frame().reset_index()\n",
    "capture_totals.columns = ['capture_date', 'total']\n",
    "capture_totals"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<div id=\"altair-viz-da97fd15b3ba4f72a26f19ac84a42194\"></div>\n",
       "<script type=\"text/javascript\">\n",
       "  (function(spec, embedOpt){\n",
       "    let outputDiv = document.currentScript.previousElementSibling;\n",
       "    if (outputDiv.id !== \"altair-viz-da97fd15b3ba4f72a26f19ac84a42194\") {\n",
       "      outputDiv = document.getElementById(\"altair-viz-da97fd15b3ba4f72a26f19ac84a42194\");\n",
       "    }\n",
       "    const paths = {\n",
       "      \"vega\": \"https://cdn.jsdelivr.net/npm//vega@5?noext\",\n",
       "      \"vega-lib\": \"https://cdn.jsdelivr.net/npm//vega-lib?noext\",\n",
       "      \"vega-lite\": \"https://cdn.jsdelivr.net/npm//vega-lite@4.8.1?noext\",\n",
       "      \"vega-embed\": \"https://cdn.jsdelivr.net/npm//vega-embed@6?noext\",\n",
       "    };\n",
       "\n",
       "    function loadScript(lib) {\n",
       "      return new Promise(function(resolve, reject) {\n",
       "        var s = document.createElement('script');\n",
       "        s.src = paths[lib];\n",
       "        s.async = true;\n",
       "        s.onload = () => resolve(paths[lib]);\n",
       "        s.onerror = () => reject(`Error loading script: ${paths[lib]}`);\n",
       "        document.getElementsByTagName(\"head\")[0].appendChild(s);\n",
       "      });\n",
       "    }\n",
       "\n",
       "    function showError(err) {\n",
       "      outputDiv.innerHTML = `<div class=\"error\" style=\"color:red;\">${err}</div>`;\n",
       "      throw err;\n",
       "    }\n",
       "\n",
       "    function displayChart(vegaEmbed) {\n",
       "      vegaEmbed(outputDiv, spec, embedOpt)\n",
       "        .catch(err => showError(`Javascript Error: ${err.message}<br>This usually means there's a typo in your chart specification. See the javascript console for the full traceback.`));\n",
       "    }\n",
       "\n",
       "    if(typeof define === \"function\" && define.amd) {\n",
       "      requirejs.config({paths});\n",
       "      require([\"vega-embed\"], displayChart, err => showError(`Error loading script: ${err.message}`));\n",
       "    } else if (typeof vegaEmbed === \"function\") {\n",
       "      displayChart(vegaEmbed);\n",
       "    } else {\n",
       "      loadScript(\"vega\")\n",
       "        .then(() => loadScript(\"vega-lite\"))\n",
       "        .then(() => loadScript(\"vega-embed\"))\n",
       "        .catch(showError)\n",
       "        .then(() => displayChart(vegaEmbed));\n",
       "    }\n",
       "  })({\"config\": {\"view\": {\"continuousWidth\": 400, \"continuousHeight\": 300}}, \"data\": {\"name\": \"data-c8b1ad7dc677c2d35bdf972859448e19\"}, \"mark\": {\"type\": \"line\", \"point\": true}, \"encoding\": {\"tooltip\": [{\"type\": \"temporal\", \"field\": \"capture_date\", \"format\": \"%e %b %Y\"}, {\"type\": \"quantitative\", \"field\": \"total\"}], \"x\": {\"type\": \"temporal\", \"field\": \"capture_date\", \"title\": \"Date captured\"}, \"y\": {\"type\": \"quantitative\", \"field\": \"total\", \"title\": \"Number of newspaper titles\"}}, \"width\": 700, \"$schema\": \"https://vega.github.io/schema/vega-lite/v4.8.1.json\", \"datasets\": {\"data-c8b1ad7dc677c2d35bdf972859448e19\": [{\"capture_date\": \"2021-04-15\", \"total\": 1666}, {\"capture_date\": \"2021-03-11\", \"total\": 1658}, {\"capture_date\": \"2021-02-05\", \"total\": 1649}, {\"capture_date\": \"2020-11-12\", \"total\": 1625}, {\"capture_date\": \"2020-05-10\", \"total\": 1553}, {\"capture_date\": \"2020-01-15\", \"total\": 1505}, {\"capture_date\": \"2019-10-28\", \"total\": 1488}, {\"capture_date\": \"2019-06-26\", \"total\": 1454}, {\"capture_date\": \"2019-05-19\", \"total\": 1447}, {\"capture_date\": \"2019-03-22\", \"total\": 1434}, {\"capture_date\": \"2018-11-28\", \"total\": 1399}, {\"capture_date\": \"2018-11-29\", \"total\": 1399}, {\"capture_date\": \"2018-10-15\", \"total\": 1371}, {\"capture_date\": \"2018-08-19\", \"total\": 1367}, {\"capture_date\": \"2018-09-19\", \"total\": 1367}, {\"capture_date\": \"2017-10-22\", \"total\": 1296}, {\"capture_date\": \"2017-09-20\", \"total\": 1291}, {\"capture_date\": \"2017-07-18\", \"total\": 1280}, {\"capture_date\": \"2017-08-20\", \"total\": 1280}, {\"capture_date\": \"2017-08-05\", \"total\": 1280}, {\"capture_date\": \"2017-08-22\", \"total\": 1280}, {\"capture_date\": \"2017-07-01\", \"total\": 1279}, {\"capture_date\": \"2017-06-24\", \"total\": 1274}, {\"capture_date\": \"2017-05-28\", \"total\": 1272}, {\"capture_date\": \"2017-04-28\", \"total\": 1259}, {\"capture_date\": \"2016-09-12\", \"total\": 1199}, {\"capture_date\": \"2016-08-10\", \"total\": 1199}, {\"capture_date\": \"2016-08-18\", \"total\": 1199}, {\"capture_date\": \"2016-11-21\", \"total\": 1199}, {\"capture_date\": \"2016-07-21\", \"total\": 1198}, {\"capture_date\": \"2016-06-29\", \"total\": 1187}, {\"capture_date\": \"2016-06-28\", \"total\": 1186}, {\"capture_date\": \"2016-06-04\", \"total\": 1148}, {\"capture_date\": \"2016-05-05\", \"total\": 1123}, {\"capture_date\": \"2016-04-04\", \"total\": 1112}, {\"capture_date\": \"2016-03-03\", \"total\": 1108}, {\"capture_date\": \"2016-02-29\", \"total\": 1104}, {\"capture_date\": \"2016-01-17\", \"total\": 1086}, {\"capture_date\": \"2016-01-14\", \"total\": 1085}, {\"capture_date\": \"2016-01-08\", \"total\": 1081}, {\"capture_date\": \"2015-11-18\", \"total\": 1046}, {\"capture_date\": \"2015-11-13\", \"total\": 1041}, {\"capture_date\": \"2015-11-10\", \"total\": 1039}, {\"capture_date\": \"2015-11-01\", \"total\": 1031}, {\"capture_date\": \"2015-09-14\", \"total\": 973}, {\"capture_date\": \"2015-09-05\", \"total\": 970}, {\"capture_date\": \"2015-09-01\", \"total\": 969}, {\"capture_date\": \"2015-08-21\", \"total\": 963}, {\"capture_date\": \"2015-08-13\", \"total\": 956}, {\"capture_date\": \"2015-07-02\", \"total\": 921}, {\"capture_date\": \"2015-04-25\", \"total\": 878}, {\"capture_date\": \"2015-04-01\", \"total\": 873}, {\"capture_date\": \"2015-03-01\", \"total\": 852}, {\"capture_date\": \"2015-02-24\", \"total\": 849}, {\"capture_date\": \"2015-02-23\", \"total\": 848}, {\"capture_date\": \"2015-01-28\", \"total\": 842}, {\"capture_date\": \"2014-12-28\", \"total\": 823}, {\"capture_date\": \"2014-10-22\", \"total\": 785}, {\"capture_date\": \"2014-10-14\", \"total\": 782}, {\"capture_date\": \"2014-09-24\", \"total\": 769}, {\"capture_date\": \"2014-07-31\", \"total\": 711}, {\"capture_date\": \"2014-07-22\", \"total\": 711}, {\"capture_date\": \"2014-05-30\", \"total\": 700}, {\"capture_date\": \"2014-04-07\", \"total\": 672}, {\"capture_date\": \"2014-02-11\", \"total\": 663}, {\"capture_date\": \"2013-12-07\", \"total\": 634}, {\"capture_date\": \"2013-10-28\", \"total\": 607}, {\"capture_date\": \"2013-10-12\", \"total\": 601}, {\"capture_date\": \"2013-10-06\", \"total\": 598}, {\"capture_date\": \"2013-05-09\", \"total\": 430}, {\"capture_date\": \"2013-01-17\", \"total\": 349}, {\"capture_date\": \"2012-12-27\", \"total\": 345}, {\"capture_date\": \"2012-10-30\", \"total\": 335}, {\"capture_date\": \"2012-10-27\", \"total\": 334}, {\"capture_date\": \"2012-10-26\", \"total\": 334}, {\"capture_date\": \"2012-09-23\", \"total\": 322}, {\"capture_date\": \"2012-09-09\", \"total\": 322}, {\"capture_date\": \"2012-08-25\", \"total\": 297}, {\"capture_date\": \"2012-08-22\", \"total\": 296}, {\"capture_date\": \"2012-06-27\", \"total\": 284}, {\"capture_date\": \"2012-06-26\", \"total\": 284}, {\"capture_date\": \"2012-05-13\", \"total\": 275}, {\"capture_date\": \"2012-05-10\", \"total\": 274}, {\"capture_date\": \"2012-05-07\", \"total\": 270}, {\"capture_date\": \"2012-05-04\", \"total\": 269}, {\"capture_date\": \"2012-05-06\", \"total\": 269}, {\"capture_date\": \"2012-04-19\", \"total\": 266}, {\"capture_date\": \"2012-02-15\", \"total\": 250}, {\"capture_date\": \"2012-01-20\", \"total\": 242}, {\"capture_date\": \"2012-01-07\", \"total\": 234}, {\"capture_date\": \"2011-12-19\", \"total\": 225}, {\"capture_date\": \"2011-12-06\", \"total\": 221}, {\"capture_date\": \"2011-11-16\", \"total\": 216}, {\"capture_date\": \"2011-10-27\", \"total\": 215}, {\"capture_date\": \"2011-10-19\", \"total\": 214}, {\"capture_date\": \"2011-10-16\", \"total\": 210}, {\"capture_date\": \"2011-09-23\", \"total\": 205}, {\"capture_date\": \"2011-09-04\", \"total\": 194}, {\"capture_date\": \"2011-08-14\", \"total\": 176}, {\"capture_date\": \"2011-05-14\", \"total\": 137}, {\"capture_date\": \"2011-04-06\", \"total\": 118}, {\"capture_date\": \"2011-03-26\", \"total\": 112}, {\"capture_date\": \"2011-03-12\", \"total\": 104}, {\"capture_date\": \"2011-02-27\", \"total\": 99}, {\"capture_date\": \"2010-05-01\", \"total\": 37}, {\"capture_date\": \"2010-04-16\", \"total\": 37}, {\"capture_date\": \"2010-04-28\", \"total\": 37}, {\"capture_date\": \"2009-11-24\", \"total\": 34}, {\"capture_date\": \"2009-12-12\", \"total\": 34}, {\"capture_date\": \"2009-11-22\", \"total\": 34}, {\"capture_date\": \"2009-11-12\", \"total\": 34}]}}, {\"mode\": \"vega-lite\"});\n",
       "</script>"
      ],
      "text/plain": [
       "alt.Chart(...)"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "alt.Chart(capture_totals).mark_line(point=True).encode(\n",
    "    x=alt.X('capture_date:T', title='Date captured'),\n",
    "    y=alt.Y('total:Q', title='Number of newspaper titles'),\n",
    "    tooltip=[alt.Tooltip('capture_date:T', format='%e %b %Y'), 'total:Q'],\n",
    ").properties(width=700)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## When did titles first appear?\n",
    "\n",
    "For historiographical purposes, its useful to know when a particular title first appeared in Trove. Here we'll only keep the first appearance of each title (or any subsequent changes to its date range / location)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "first_appearance = df.drop_duplicates(subset=['title', 'place', 'dates'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>title_id</th>\n",
       "      <th>full_title</th>\n",
       "      <th>title</th>\n",
       "      <th>place</th>\n",
       "      <th>dates</th>\n",
       "      <th>capture_date</th>\n",
       "      <th>capture_timestamp</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>34</td>\n",
       "      <td>Advertiser (Adelaide, SA : 1889-1931)</td>\n",
       "      <td>Advertiser</td>\n",
       "      <td>Adelaide, SA</td>\n",
       "      <td>1889 - 1931</td>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>20091112000713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>13</td>\n",
       "      <td>Argus (Melbourne, Vic. : 1848-1954)</td>\n",
       "      <td>Argus</td>\n",
       "      <td>Melbourne, VIC</td>\n",
       "      <td>1848 - 1954</td>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>20091112000713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>16</td>\n",
       "      <td>Brisbane Courier (Qld. : 1864-1933)</td>\n",
       "      <td>Brisbane Courier</td>\n",
       "      <td>QLD</td>\n",
       "      <td>1864 - 1933</td>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>20091112000713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>11</td>\n",
       "      <td>Canberra Times (ACT : 1926-1954)</td>\n",
       "      <td>Canberra Times</td>\n",
       "      <td>ACT</td>\n",
       "      <td>1926 - 1954</td>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>20091112000713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>24</td>\n",
       "      <td>Colonial Times (Hobart, Tas. : 1828-1857)</td>\n",
       "      <td>Colonial Times</td>\n",
       "      <td>Hobart, TAS</td>\n",
       "      <td>1828 - 1857</td>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>20091112000713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89211</th>\n",
       "      <td>1700</td>\n",
       "      <td>Port Lincoln, Tumby and West Coast Recorder (S...</td>\n",
       "      <td>Port Lincoln, Tumby and West Coast Recorder</td>\n",
       "      <td>SA</td>\n",
       "      <td>1904 - 1909</td>\n",
       "      <td>2021-04-15</td>\n",
       "      <td>20210415021550</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89258</th>\n",
       "      <td>1702</td>\n",
       "      <td>West Coast Recorder (Port Lincoln, SA : 1909 -...</td>\n",
       "      <td>West Coast Recorder</td>\n",
       "      <td>Port Lincoln, SA</td>\n",
       "      <td>1909 - 1942</td>\n",
       "      <td>2021-04-15</td>\n",
       "      <td>20210415021550</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89487</th>\n",
       "      <td>1703</td>\n",
       "      <td>Express, Bacchus Marsh (Vic. : 1943 - 1954)</td>\n",
       "      <td>Express, Bacchus Marsh</td>\n",
       "      <td>VIC</td>\n",
       "      <td>1943 - 1954</td>\n",
       "      <td>2021-04-15</td>\n",
       "      <td>20210415021550</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89671</th>\n",
       "      <td>310</td>\n",
       "      <td>Richmond Guardian (Vic. : 1885; 1904 - 1922)</td>\n",
       "      <td>Richmond Guardian</td>\n",
       "      <td>VIC</td>\n",
       "      <td>1885; 1904 - 1922</td>\n",
       "      <td>2021-04-15</td>\n",
       "      <td>20210415021550</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89944</th>\n",
       "      <td>1638</td>\n",
       "      <td>Miner's Right (Boulder, WA : 1897)</td>\n",
       "      <td>Miner's Right</td>\n",
       "      <td>Boulder, WA</td>\n",
       "      <td>1897</td>\n",
       "      <td>2021-04-15</td>\n",
       "      <td>20210415021550</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>2040 rows × 7 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      title_id                                         full_title  \\\n",
       "0           34              Advertiser (Adelaide, SA : 1889-1931)   \n",
       "1           13                Argus (Melbourne, Vic. : 1848-1954)   \n",
       "2           16                Brisbane Courier (Qld. : 1864-1933)   \n",
       "3           11                   Canberra Times (ACT : 1926-1954)   \n",
       "4           24          Colonial Times (Hobart, Tas. : 1828-1857)   \n",
       "...        ...                                                ...   \n",
       "89211     1700  Port Lincoln, Tumby and West Coast Recorder (S...   \n",
       "89258     1702  West Coast Recorder (Port Lincoln, SA : 1909 -...   \n",
       "89487     1703        Express, Bacchus Marsh (Vic. : 1943 - 1954)   \n",
       "89671      310       Richmond Guardian (Vic. : 1885; 1904 - 1922)   \n",
       "89944     1638                 Miner's Right (Boulder, WA : 1897)   \n",
       "\n",
       "                                             title             place  \\\n",
       "0                                       Advertiser      Adelaide, SA   \n",
       "1                                            Argus    Melbourne, VIC   \n",
       "2                                 Brisbane Courier               QLD   \n",
       "3                                   Canberra Times               ACT   \n",
       "4                                   Colonial Times       Hobart, TAS   \n",
       "...                                            ...               ...   \n",
       "89211  Port Lincoln, Tumby and West Coast Recorder                SA   \n",
       "89258                          West Coast Recorder  Port Lincoln, SA   \n",
       "89487                       Express, Bacchus Marsh               VIC   \n",
       "89671                            Richmond Guardian               VIC   \n",
       "89944                                Miner's Right       Boulder, WA   \n",
       "\n",
       "                   dates capture_date capture_timestamp  \n",
       "0            1889 - 1931   2009-11-12    20091112000713  \n",
       "1            1848 - 1954   2009-11-12    20091112000713  \n",
       "2            1864 - 1933   2009-11-12    20091112000713  \n",
       "3            1926 - 1954   2009-11-12    20091112000713  \n",
       "4            1828 - 1857   2009-11-12    20091112000713  \n",
       "...                  ...          ...               ...  \n",
       "89211        1904 - 1909   2021-04-15    20210415021550  \n",
       "89258        1909 - 1942   2021-04-15    20210415021550  \n",
       "89487        1943 - 1954   2021-04-15    20210415021550  \n",
       "89671  1885; 1904 - 1922   2021-04-15    20210415021550  \n",
       "89944               1897   2021-04-15    20210415021550  \n",
       "\n",
       "[2040 rows x 7 columns]"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "first_appearance"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Find when a particular newspaper first appeared."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>title_id</th>\n",
       "      <th>full_title</th>\n",
       "      <th>title</th>\n",
       "      <th>place</th>\n",
       "      <th>dates</th>\n",
       "      <th>capture_date</th>\n",
       "      <th>capture_timestamp</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>11</td>\n",
       "      <td>Canberra Times (ACT : 1926-1954)</td>\n",
       "      <td>Canberra Times</td>\n",
       "      <td>ACT</td>\n",
       "      <td>1926 - 1954</td>\n",
       "      <td>2009-11-12</td>\n",
       "      <td>20091112000713</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9395</th>\n",
       "      <td>11</td>\n",
       "      <td>Canberra Times (ACT : 1926 - 1995)</td>\n",
       "      <td>Canberra Times</td>\n",
       "      <td>ACT</td>\n",
       "      <td>1926 - 1995</td>\n",
       "      <td>2012-12-27</td>\n",
       "      <td>20121227113753</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     title_id                          full_title           title place  \\\n",
       "3          11    Canberra Times (ACT : 1926-1954)  Canberra Times   ACT   \n",
       "9395       11  Canberra Times (ACT : 1926 - 1995)  Canberra Times   ACT   \n",
       "\n",
       "            dates capture_date capture_timestamp  \n",
       "3     1926 - 1954   2009-11-12    20091112000713  \n",
       "9395  1926 - 1995   2012-12-27    20121227113753  "
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "first_appearance.loc[first_appearance['title'] == 'Canberra Times']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Generate an alphabetical list for easy browsing. View the [results as a Gist](https://gist.github.com/wragge/7d80507c3e7957e271c572b8f664031a)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "with Path('titles_list.md').open('w') as titles_list:\n",
    "    for title, group in first_appearance.groupby(['title', 'title_id']):\n",
    "        places = ' | '.join(group['place'].unique())\n",
    "        titles_list.write(f'<h4><a href=\"http://nla.gov.au/nla.news-title{title[1]}\">{title[0]} ({places})</a></h4>')\n",
    "        titles_list.write(group.sort_values(by='capture_date')[['capture_date','dates', 'place']].to_html(index=False))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Save this dataset to CSV."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "first_appearance.to_csv('trove_newspaper_titles_first_appearance_2009_2021.csv', index=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "\n",
    "Created by [Tim Sherratt](https://timsherratt.org/) for the [GLAM Workbench](https://glam-workbench.github.io/).  \n",
    "Support this project by becoming a [GitHub sponsor](https://github.com/sponsors/wragge?o=esb)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}