{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Queensland State Archives, Naturalisations, 1851 to 1904\n", "\n", "## Add series information to index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The [Naturalisations, 1851 to 1904](https://data.qld.gov.au/dataset/naturalisations-1851-to-1904) index is available from the Queensland Government data portal. The notes explain:\n", "\n", "> This index was created from various records detailing the names of those who took oaths of allegiance to be naturalised as created by the Supreme Court across Queensland as well as the Colonial Secretary's Office and the Government Residents Office.\n", "\n", "What the notes don't make clear is that the index collates name entries from a number of different series, with a separate row for each name reference. As a result, there can be multiple rows referring to the naturalisation of a single individual. This is important to keep in mind if you're trying to analyse aggregate data relating to naturalisations in Queensland.\n", "\n", "This notebook adds series information to the original index so that you can filter the data by series."
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "RendererRegistry.enable('notebook')" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "import altair as alt\n", "import requests\n", "from bs4 import BeautifulSoup\n", "from tqdm import tqdm_notebook\n", "from tqdm.auto import tqdm\n", "from IPython.display import display, HTML, FileLink\n", "alt.renderers.enable('notebook')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load the data" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# The encoding option is necessary to avoid unicode errors\n", "df = pd.read_csv('https://data.qld.gov.au/dataset/91970fa7-d3c3-4171-a89d-410481cb90e9/resource/7b5ddae5-78ef-4d8e-b800-56e6f30d26d5/download/naturalisations-1851-1908.csv', encoding='ISO-8859-1', keep_default_na=False)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>Last name</th><th>Given names</th><th>Number</th><th>Page</th><th>Year</th><th>Item ID</th><th>QSA ref</th><th>Microfilm no</th><th>Notes</th><th>Index name</th><th>Description</th><th>Source</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>AANENSEN</td><td>Gunder</td><td>20</td><td></td><td>1901</td><td>1697781</td><td>A/49120</td><td>Z1999</td><td></td><td>Naturalisations 1851-1908</td><td>Generated from records created by the Supreme ...</td><td>http://www.archivessearch.qld.gov.au/Search/It...</td></tr>\n",
" <tr><th>1</th><td>AAROE</td><td>Knud Lauritzen</td><td>7532</td><td></td><td>1885</td><td>882267</td><td>SCT/CF16</td><td>Z2206</td><td></td><td>Naturalisations 1851-1908</td><td>Generated from records created by the Supreme ...</td><td>http://www.archivessearch.qld.gov.au/Search/It...</td></tr>\n",
" <tr><th>2</th><td>AAROE</td><td>Knud Lauritzen</td><td>7532</td><td>42</td><td>1885</td><td>841183</td><td>SCT/CF37</td><td>Z2286</td><td></td><td>Naturalisations 1851-1908</td><td>Generated from records created by the Supreme ...</td><td>http://www.archivessearch.qld.gov.au/Search/It...</td></tr>\n",
" <tr><th>3</th><td>AASKOO</td><td>Hans Pedersen</td><td>4050</td><td>A</td><td>1877</td><td>841182</td><td>SCT/CF36</td><td>Z2286</td><td></td><td>Naturalisations 1851-1908</td><td>Generated from records created by the Supreme ...</td><td>http://www.archivessearch.qld.gov.au/Search/It...</td></tr>\n",
" <tr><th>4</th><td>AASKOO</td><td>Hans Pedersen</td><td>4050</td><td>B</td><td>1877</td><td>841182</td><td>SCT/CF36</td><td>Z2286</td><td></td><td>Naturalisations 1851-1908</td><td>Generated from records created by the Supreme ...</td><td>http://www.archivessearch.qld.gov.au/Search/It...</td></tr>\n",
" </tbody>\n",
"</table>
" ], "text/plain": [ " Last name Given names Number Page Year Item ID QSA ref Microfilm no \\\n", "0 AANENSEN Gunder 20 1901 1697781 A/49120 Z1999 \n", "1 AAROE Knud Lauritzen 7532 1885 882267 SCT/CF16 Z2206 \n", "2 AAROE Knud Lauritzen 7532 42 1885 841183 SCT/CF37 Z2286 \n", "3 AASKOO Hans Pedersen 4050 A 1877 841182 SCT/CF36 Z2286 \n", "4 AASKOO Hans Pedersen 4050 B 1877 841182 SCT/CF36 Z2286 \n", "\n", " Notes Index name \\\n", "0 Naturalisations 1851-1908 \n", "1 Naturalisations 1851-1908 \n", "2 Naturalisations 1851-1908 \n", "3 Naturalisations 1851-1908 \n", "4 Naturalisations 1851-1908 \n", "\n", " Description \\\n", "0 Generated from records created by the Supreme ... \n", "1 Generated from records created by the Supreme ... \n", "2 Generated from records created by the Supreme ... \n", "3 Generated from records created by the Supreme ... \n", "4 Generated from records created by the Supreme ... \n", "\n", " Source \n", "0 http://www.archivessearch.qld.gov.au/Search/It... \n", "1 http://www.archivessearch.qld.gov.au/Search/It... \n", "2 http://www.archivessearch.qld.gov.au/Search/It... \n", "3 http://www.archivessearch.qld.gov.au/Search/It... \n", "4 http://www.archivessearch.qld.gov.au/Search/It... " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "26769" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# How many rows?\n", "len(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add series information" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One way of removing duplicates is to filter the results by series. However, while each entry includes an `Item ID`, it doesn't include a series identifier. 
To get the series information we have to request the item details web page, and scrape the series information from it.\n", "\n", "Rather than loop through the whole dataset, we'll grab all the unique `Item ID` values first, get the series information for each, then merge this data back into the original dataset." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "00486416bf9e4a259c5faf6f3b97c6ea", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0, description='Progress', max=91, style=ProgressStyle(description_width='ini…" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "def get_series(row):\n", "    '''\n", "    Get the series id and title for the item identifier in the given row.\n", "    '''\n", "    response = requests.get('http://www.archivessearch.qld.gov.au/Search/ItemDetails.aspx', params={'ItemId': row['item_id']})\n", "    # Name the parser explicitly so the result doesn't depend on which parsers are installed\n", "    soup = BeautifulSoup(response.text, 'html.parser')\n", "    series_id = soup.find(id='ctl00_cphMain_RecordDetailsView_SeriesFormView_SERIES_IDLabel').string\n", "    series_title = soup.find('a', id='ctl00_cphMain_RecordDetailsView_SeriesFormView_TitleHyperLink').string\n", "    return pd.Series([series_id, series_title])\n", "\n", "tqdm.pandas(desc=\"Progress\")\n", "# Get the unique item ids\n", "item_ids = pd.DataFrame(df['Item ID'].unique())\n", "item_ids.columns = ['item_id']\n", "# Get series data for each item id\n", "item_ids[['series_id', 'series_title']] = item_ids.progress_apply(get_series, axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have a dataframe linking item ids to series information, we can see which series are represented in the original dataset." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>series_id</th><th>series_title</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>9403</td><td>Naturalisation Files</td></tr>\n",
" <tr><th>1</th><td>5741</td><td>Oaths of Allegiance Sworn by Aliens Being Naturalised</td></tr>\n",
" <tr><th>2</th><td>5177</td><td>Registers of Aliens to Whom Oaths of Allegiance for Naturalisation Were Administered</td></tr>\n",
" <tr><th>5</th><td>8400</td><td>Special Batches</td></tr>\n",
" <tr><th>6</th><td>7224</td><td>Oaths of Allegiance</td></tr>\n",
" <tr><th>29</th><td>5743</td><td>Certificates of Naturalisation and Associated Papers</td></tr>\n",
" <tr><th>43</th><td>7164</td><td>Register of Fees of Office</td></tr>\n",
" <tr><th>50</th><td>12748</td><td>Letters Addressed to the Government Resident by the Colonial Secretary, Sydney</td></tr>\n",
" <tr><th>54</th><td>5745</td><td>Applications for Copies of Records of Naturalisation, and Related Correspondence</td></tr>\n",
" <tr><th>67</th><td>5253</td><td>Inwards Correspondence</td></tr>\n",
" </tbody>\n",
"</table>
" ], "text/plain": [ " series_id \\\n", "0 9403 \n", "1 5741 \n", "2 5177 \n", "5 8400 \n", "6 7224 \n", "29 5743 \n", "43 7164 \n", "50 12748 \n", "54 5745 \n", "67 5253 \n", "\n", " series_title \n", "0 Naturalisation Files \n", "1 Oaths of Allegiance Sworn by Aliens Being Naturalised \n", "2 Registers of Aliens to Whom Oaths of Allegiance for Naturalisation Were Administered \n", "5 Special Batches \n", "6 Oaths of Allegiance \n", "29 Certificates of Naturalisation and Associated Papers \n", "43 Register of Fees of Office \n", "50 Letters Addressed to the Government Resident by the Colonial Secretary, Sydney \n", "54 Applications for Copies of Records of Naturalisation, and Related Correspondence \n", "67 Inwards Correspondence " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.set_option('display.max_colwidth', -1)\n", "# List the indexed series\n", "item_ids[['series_id', 'series_title']].drop_duplicates()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Merge the series data back into the original dataset." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>Last name</th><th>Given names</th><th>Number</th><th>Page</th><th>Year</th><th>Item ID</th><th>QSA ref</th><th>Microfilm no</th><th>Notes</th><th>Index name</th><th>Description</th><th>Source</th><th>item_id</th><th>series_id</th><th>series_title</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>AANENSEN</td><td>Gunder</td><td>20</td><td></td><td>1901</td><td>1697781</td><td>A/49120</td><td>Z1999</td><td></td><td>Naturalisations 1851-1908</td><td>Generated from records created by the Supreme ...</td><td>http://www.archivessearch.qld.gov.au/Search/It...</td><td>1697781</td><td>9403</td><td>Naturalisation Files</td></tr>\n",
" <tr><th>1</th><td>AAROE</td><td>Knud Lauritzen</td><td>7532</td><td></td><td>1885</td><td>882267</td><td>SCT/CF16</td><td>Z2206</td><td></td><td>Naturalisations 1851-1908</td><td>Generated from records created by the Supreme ...</td><td>http://www.archivessearch.qld.gov.au/Search/It...</td><td>882267</td><td>5741</td><td>Oaths of Allegiance Sworn by Aliens Being Natu...</td></tr>\n",
" <tr><th>2</th><td>AAROE</td><td>Knud Lauritzen</td><td>7532</td><td>42</td><td>1885</td><td>841183</td><td>SCT/CF37</td><td>Z2286</td><td></td><td>Naturalisations 1851-1908</td><td>Generated from records created by the Supreme ...</td><td>http://www.archivessearch.qld.gov.au/Search/It...</td><td>841183</td><td>5177</td><td>Registers of Aliens to Whom Oaths of Allegianc...</td></tr>\n",
" <tr><th>3</th><td>AASKOO</td><td>Hans Pedersen</td><td>4050</td><td>A</td><td>1877</td><td>841182</td><td>SCT/CF36</td><td>Z2286</td><td></td><td>Naturalisations 1851-1908</td><td>Generated from records created by the Supreme ...</td><td>http://www.archivessearch.qld.gov.au/Search/It...</td><td>841182</td><td>5177</td><td>Registers of Aliens to Whom Oaths of Allegianc...</td></tr>\n",
" <tr><th>4</th><td>AASKOO</td><td>Hans Pedersen</td><td>4050</td><td>B</td><td>1877</td><td>841182</td><td>SCT/CF36</td><td>Z2286</td><td></td><td>Naturalisations 1851-1908</td><td>Generated from records created by the Supreme ...</td><td>http://www.archivessearch.qld.gov.au/Search/It...</td><td>841182</td><td>5177</td><td>Registers of Aliens to Whom Oaths of Allegianc...</td></tr>\n",
" </tbody>\n",
"</table>
\n", "
" ], "text/plain": [ " Last name Given names Number Page Year Item ID QSA ref Microfilm no \\\n", "0 AANENSEN Gunder 20 1901 1697781 A/49120 Z1999 \n", "1 AAROE Knud Lauritzen 7532 1885 882267 SCT/CF16 Z2206 \n", "2 AAROE Knud Lauritzen 7532 42 1885 841183 SCT/CF37 Z2286 \n", "3 AASKOO Hans Pedersen 4050 A 1877 841182 SCT/CF36 Z2286 \n", "4 AASKOO Hans Pedersen 4050 B 1877 841182 SCT/CF36 Z2286 \n", "\n", " Notes Index name \\\n", "0 Naturalisations 1851-1908 \n", "1 Naturalisations 1851-1908 \n", "2 Naturalisations 1851-1908 \n", "3 Naturalisations 1851-1908 \n", "4 Naturalisations 1851-1908 \n", "\n", " Description \\\n", "0 Generated from records created by the Supreme ... \n", "1 Generated from records created by the Supreme ... \n", "2 Generated from records created by the Supreme ... \n", "3 Generated from records created by the Supreme ... \n", "4 Generated from records created by the Supreme ... \n", "\n", " Source item_id series_id \\\n", "0 http://www.archivessearch.qld.gov.au/Search/It... 1697781 9403 \n", "1 http://www.archivessearch.qld.gov.au/Search/It... 882267 5741 \n", "2 http://www.archivessearch.qld.gov.au/Search/It... 841183 5177 \n", "3 http://www.archivessearch.qld.gov.au/Search/It... 841182 5177 \n", "4 http://www.archivessearch.qld.gov.au/Search/It... 841182 5177 \n", "\n", " series_title \n", "0 Naturalisation Files \n", "1 Oaths of Allegiance Sworn by Aliens Being Natu... \n", "2 Registers of Aliens to Whom Oaths of Allegianc... \n", "3 Registers of Aliens to Whom Oaths of Allegianc... \n", "4 Registers of Aliens to Whom Oaths of Allegianc... 
" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.set_option('display.max_colwidth', 50)\n", "# Merge the series data into the original dataset\n", "qld_df = pd.merge(df, item_ids, left_on='Item ID', right_on='item_id', how='left')\n", "qld_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Save as CSV\n", "\n", "Save the enriched dataset as a CSV file." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "qsa_naturalisations_index_with_series.csv
" ], "text/plain": [ "/Users/tim/mycode/glam-workbench/qsa/notebooks/qsa_naturalisations_index_with_series.csv" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "qld_df.to_csv('qsa_naturalisations_index_with_series.csv', index=False)\n", "display(FileLink('qsa_naturalisations_index_with_series.csv'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Explore by series\n", "\n", "Now that we've associated each entry in the original dataset with a series, we can break the data down by series to better understand the content of the index." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5177 14325\n", "5741 10344\n", "7224 734\n", "8400 528\n", "9403 258\n", "5743 230\n", "5745 131\n", "7164 122\n", "5253 82\n", "12748 15\n", "Name: series_id, dtype: int64" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Number of entries per series\n", "qld_df['series_id'].value_counts()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def filter_by_series(series_id):\n", "    '''\n", "    Filter dataset by series id.\n", "    '''\n", "    filtered_df = qld_df.loc[qld_df['series_id'] == series_id].copy()\n", "    return filtered_df\n", "\n", "def get_counts_by_year(df):\n", "    '''\n", "    Aggregate data by year and prepare for charting.\n", "    '''\n", "    counts = df['Year'].groupby([df['Year']]).agg('count').to_frame()\n", "    counts.columns = ['count']\n", "    counts = counts.reset_index()\n", "    # Filter out date errors\n", "    counts = counts.loc[(counts['Year'] > 0) & (counts['Year'] < 1910)].copy()\n", "    # Create a datetime field\n", "    counts['date'] = pd.to_datetime(counts['Year'], format='%Y')\n", "    return counts" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "application/javascript": [ "var spec = {\"config\": {\"view\": {\"width\": 400, \"height\": 300}}, \"vconcat\": 
[{\"data\": {\"name\": \"data-83569b0b1790191e9a3337e4cd9c3c53\"}, \"mark\": {\"type\": \"bar\", \"size\": 10}, \"encoding\": {\"x\": {\"type\": \"temporal\", \"field\": \"date\", \"timeUnit\": \"year\", \"title\": \"Year\"}, \"y\": {\"type\": \"quantitative\", \"field\": \"count\"}}, \"title\": \"Series 5177\", \"width\": 700}, {\"data\": {\"name\": \"data-01fa06077bca74b775ca36596df7a92a\"}, \"mark\": {\"type\": \"bar\", \"size\": 10}, \"encoding\": {\"x\": {\"type\": \"temporal\", \"field\": \"date\", \"timeUnit\": \"year\", \"title\": \"Year\"}, \"y\": {\"type\": \"quantitative\", \"field\": \"count\"}}, \"title\": \"Series 5741\", \"width\": 700}], \"$schema\": \"https://vega.github.io/schema/vega-lite/v2.6.0.json\", \"datasets\": {\"data-83569b0b1790191e9a3337e4cd9c3c53\": [{\"Year\": 1858, \"count\": 4, \"date\": \"1858-01-01T00:00:00\"}, {\"Year\": 1859, \"count\": 5, \"date\": \"1859-01-01T00:00:00\"}, {\"Year\": 1860, \"count\": 62, \"date\": \"1860-01-01T00:00:00\"}, {\"Year\": 1861, \"count\": 114, \"date\": \"1861-01-01T00:00:00\"}, {\"Year\": 1862, \"count\": 127, \"date\": \"1862-01-01T00:00:00\"}, {\"Year\": 1863, \"count\": 103, \"date\": \"1863-01-01T00:00:00\"}, {\"Year\": 1864, \"count\": 208, \"date\": \"1864-01-01T00:00:00\"}, {\"Year\": 1865, \"count\": 240, \"date\": \"1865-01-01T00:00:00\"}, {\"Year\": 1866, \"count\": 244, \"date\": \"1866-01-01T00:00:00\"}, {\"Year\": 1867, \"count\": 128, \"date\": \"1867-01-01T00:00:00\"}, {\"Year\": 1868, \"count\": 150, \"date\": \"1868-01-01T00:00:00\"}, {\"Year\": 1869, \"count\": 158, \"date\": \"1869-01-01T00:00:00\"}, {\"Year\": 1870, \"count\": 165, \"date\": \"1870-01-01T00:00:00\"}, {\"Year\": 1871, \"count\": 137, \"date\": \"1871-01-01T00:00:00\"}, {\"Year\": 1872, \"count\": 227, \"date\": \"1872-01-01T00:00:00\"}, {\"Year\": 1873, \"count\": 275, \"date\": \"1873-01-01T00:00:00\"}, {\"Year\": 1874, \"count\": 474, \"date\": \"1874-01-01T00:00:00\"}, {\"Year\": 1875, \"count\": 425, 
\"date\": \"1875-01-01T00:00:00\"}, {\"Year\": 1876, \"count\": 339, \"date\": \"1876-01-01T00:00:00\"}, {\"Year\": 1877, \"count\": 363, \"date\": \"1877-01-01T00:00:00\"}, {\"Year\": 1878, \"count\": 457, \"date\": \"1878-01-01T00:00:00\"}, {\"Year\": 1879, \"count\": 340, \"date\": \"1879-01-01T00:00:00\"}, {\"Year\": 1880, \"count\": 372, \"date\": \"1880-01-01T00:00:00\"}, {\"Year\": 1881, \"count\": 391, \"date\": \"1881-01-01T00:00:00\"}, {\"Year\": 1882, \"count\": 474, \"date\": \"1882-01-01T00:00:00\"}, {\"Year\": 1883, \"count\": 572, \"date\": \"1883-01-01T00:00:00\"}, {\"Year\": 1884, \"count\": 517, \"date\": \"1884-01-01T00:00:00\"}, {\"Year\": 1885, \"count\": 419, \"date\": \"1885-01-01T00:00:00\"}, {\"Year\": 1886, \"count\": 635, \"date\": \"1886-01-01T00:00:00\"}, {\"Year\": 1887, \"count\": 509, \"date\": \"1887-01-01T00:00:00\"}, {\"Year\": 1888, \"count\": 564, \"date\": \"1888-01-01T00:00:00\"}, {\"Year\": 1889, \"count\": 425, \"date\": \"1889-01-01T00:00:00\"}, {\"Year\": 1890, \"count\": 341, \"date\": \"1890-01-01T00:00:00\"}, {\"Year\": 1891, \"count\": 325, \"date\": \"1891-01-01T00:00:00\"}, {\"Year\": 1892, \"count\": 287, \"date\": \"1892-01-01T00:00:00\"}, {\"Year\": 1893, \"count\": 260, \"date\": \"1893-01-01T00:00:00\"}, {\"Year\": 1894, \"count\": 297, \"date\": \"1894-01-01T00:00:00\"}, {\"Year\": 1895, \"count\": 324, \"date\": \"1895-01-01T00:00:00\"}, {\"Year\": 1896, \"count\": 279, \"date\": \"1896-01-01T00:00:00\"}, {\"Year\": 1897, \"count\": 274, \"date\": \"1897-01-01T00:00:00\"}, {\"Year\": 1898, \"count\": 347, \"date\": \"1898-01-01T00:00:00\"}, {\"Year\": 1899, \"count\": 367, \"date\": \"1899-01-01T00:00:00\"}, {\"Year\": 1900, \"count\": 356, \"date\": \"1900-01-01T00:00:00\"}, {\"Year\": 1901, \"count\": 459, \"date\": \"1901-01-01T00:00:00\"}, {\"Year\": 1902, \"count\": 412, \"date\": \"1902-01-01T00:00:00\"}, {\"Year\": 1903, \"count\": 374, \"date\": \"1903-01-01T00:00:00\"}], 
\"data-01fa06077bca74b775ca36596df7a92a\": [{\"Year\": 1855, \"count\": 1, \"date\": \"1855-01-01T00:00:00\"}, {\"Year\": 1858, \"count\": 2, \"date\": \"1858-01-01T00:00:00\"}, {\"Year\": 1859, \"count\": 4, \"date\": \"1859-01-01T00:00:00\"}, {\"Year\": 1860, \"count\": 70, \"date\": \"1860-01-01T00:00:00\"}, {\"Year\": 1861, \"count\": 68, \"date\": \"1861-01-01T00:00:00\"}, {\"Year\": 1862, \"count\": 126, \"date\": \"1862-01-01T00:00:00\"}, {\"Year\": 1863, \"count\": 96, \"date\": \"1863-01-01T00:00:00\"}, {\"Year\": 1864, \"count\": 197, \"date\": \"1864-01-01T00:00:00\"}, {\"Year\": 1865, \"count\": 243, \"date\": \"1865-01-01T00:00:00\"}, {\"Year\": 1866, \"count\": 255, \"date\": \"1866-01-01T00:00:00\"}, {\"Year\": 1867, \"count\": 129, \"date\": \"1867-01-01T00:00:00\"}, {\"Year\": 1868, \"count\": 141, \"date\": \"1868-01-01T00:00:00\"}, {\"Year\": 1869, \"count\": 160, \"date\": \"1869-01-01T00:00:00\"}, {\"Year\": 1870, \"count\": 169, \"date\": \"1870-01-01T00:00:00\"}, {\"Year\": 1871, \"count\": 147, \"date\": \"1871-01-01T00:00:00\"}, {\"Year\": 1872, \"count\": 223, \"date\": \"1872-01-01T00:00:00\"}, {\"Year\": 1873, \"count\": 263, \"date\": \"1873-01-01T00:00:00\"}, {\"Year\": 1874, \"count\": 8, \"date\": \"1874-01-01T00:00:00\"}, {\"Year\": 1875, \"count\": 432, \"date\": \"1875-01-01T00:00:00\"}, {\"Year\": 1876, \"count\": 327, \"date\": \"1876-01-01T00:00:00\"}, {\"Year\": 1877, \"count\": 348, \"date\": \"1877-01-01T00:00:00\"}, {\"Year\": 1878, \"count\": 448, \"date\": \"1878-01-01T00:00:00\"}, {\"Year\": 1879, \"count\": 356, \"date\": \"1879-01-01T00:00:00\"}, {\"Year\": 1880, \"count\": 354, \"date\": \"1880-01-01T00:00:00\"}, {\"Year\": 1884, \"count\": 10, \"date\": \"1884-01-01T00:00:00\"}, {\"Year\": 1885, \"count\": 437, \"date\": \"1885-01-01T00:00:00\"}, {\"Year\": 1886, \"count\": 648, \"date\": \"1886-01-01T00:00:00\"}, {\"Year\": 1888, \"count\": 569, \"date\": \"1888-01-01T00:00:00\"}, {\"Year\": 1889, \"count\": 425, 
\"date\": \"1889-01-01T00:00:00\"}, {\"Year\": 1890, \"count\": 333, \"date\": \"1890-01-01T00:00:00\"}, {\"Year\": 1891, \"count\": 327, \"date\": \"1891-01-01T00:00:00\"}, {\"Year\": 1892, \"count\": 287, \"date\": \"1892-01-01T00:00:00\"}, {\"Year\": 1893, \"count\": 259, \"date\": \"1893-01-01T00:00:00\"}, {\"Year\": 1894, \"count\": 249, \"date\": \"1894-01-01T00:00:00\"}, {\"Year\": 1895, \"count\": 252, \"date\": \"1895-01-01T00:00:00\"}, {\"Year\": 1896, \"count\": 197, \"date\": \"1896-01-01T00:00:00\"}, {\"Year\": 1897, \"count\": 186, \"date\": \"1897-01-01T00:00:00\"}, {\"Year\": 1898, \"count\": 253, \"date\": \"1898-01-01T00:00:00\"}, {\"Year\": 1899, \"count\": 263, \"date\": \"1899-01-01T00:00:00\"}, {\"Year\": 1900, \"count\": 278, \"date\": \"1900-01-01T00:00:00\"}, {\"Year\": 1901, \"count\": 309, \"date\": \"1901-01-01T00:00:00\"}, {\"Year\": 1902, \"count\": 263, \"date\": \"1902-01-01T00:00:00\"}, {\"Year\": 1903, \"count\": 225, \"date\": \"1903-01-01T00:00:00\"}]}};\n", "var opt = {};\n", "var type = \"vega-lite\";\n", "var id = \"8ea95caa-1108-48f3-a42c-480eaed27263\";\n", "\n", "var output_area = this;\n", "\n", "require([\"nbextensions/jupyter-vega/index\"], function(vega) {\n", " var target = document.createElement(\"div\");\n", " target.id = id;\n", " target.className = \"vega-embed\";\n", "\n", " var style = document.createElement(\"style\");\n", " style.textContent = [\n", " \".vega-embed .error p {\",\n", " \" color: firebrick;\",\n", " \" font-size: 14px;\",\n", " \"}\",\n", " ].join(\"\\\\n\");\n", "\n", " // element is a jQuery wrapped DOM element inside the output area\n", " // see http://ipython.readthedocs.io/en/stable/api/generated/\\\n", " // IPython.display.html#IPython.display.Javascript.__init__\n", " element[0].appendChild(target);\n", " element[0].appendChild(style);\n", "\n", " vega.render(\"#\" + id, spec, type, opt, output_area);\n", "}, function (err) {\n", " if (err.requireType !== \"scripterror\") {\n", " 
throw(err);\n", " }\n", "});\n" ], "text/plain": [ "" ] }, "metadata": { "jupyter-vega": "#8ea95caa-1108-48f3-a42c-480eaed27263" }, "output_type": "display_data" }, { "data": { "text/plain": [] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "" }, "metadata": { "jupyter-vega": "#8ea95caa-1108-48f3-a42c-480eaed27263" }, "output_type": "display_data" } ], "source": [ "df_5177 = filter_by_series('5177')\n", "df_5741 = filter_by_series('5741')\n", "\n", "c1 = alt.Chart(get_counts_by_year(df_5177)).mark_bar(size=10).encode(\n", " x=alt.X('year(date):T', title='Year'),\n", " y='count:Q'\n", ").properties(\n", " width=700,\n", " title='Series 5177'\n", ")\n", "\n", "c2 = alt.Chart(get_counts_by_year(df_5741)).mark_bar(size=10).encode(\n", " x=alt.X('year(date):T', title='Year'),\n", " y='count:Q'\n", ").properties(\n", " width=700,\n", " title='Series 5741'\n", ")\n", "\n", "c1 & c2" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }