{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Exploring digitised maps in Trove\n", "\n", "If you've ever poked around in Trove's 'map' zone, you might have noticed the beautiful deep-zoomable images available for many of the NLA's digitised maps. Even better, in many cases the high-resolution TIFF versions of the digitised maps are available for download.\n", "\n", "I knew there were lots of great maps you could download from Trove, but how many? And how big were the files? I thought I'd try to quantify this a bit by harvesting and analysing the metadata.\n", "\n", "The sizes of the downloadable files (in both bytes and pixels) are [embedded within the landing pages](https://nbviewer.jupyter.org/github/GLAM-Workbench/trove-books/blob/master/Metadata-for-Trove-digitised-works.ipynb) for the digitised maps. So harvesting the metadata involves a number of steps:\n", "\n", "* Use the Trove API to search for maps that include the phrase \"nla.obj\" – this will filter the results to maps that have been digitised and are available through Trove.\n", "* Work through the results, checking to see if the record includes a link to a digital copy.\n", "* If there is a digital copy, extract the embedded work data from the landing page.\n", "* Sometimes the work data doesn't include the copyright status. If it doesn't, I scrape it from the page.\n", "\n", "Here's the [downloaded metadata as a CSV-formatted file](single_maps.csv). You can also [browse the results](https://docs.google.com/spreadsheets/d/1yBPcCk9wIRovRacKbfrlyThWrzGXLF79Lr0GIQbaO9Y/edit?usp=sharing) using Google Sheets." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setting things up" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "DataTransformerRegistry.enable('json')" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import requests\n", "from tqdm import tqdm_notebook\n", "from requests.adapters import HTTPAdapter\n", "from requests.packages.urllib3.util.retry import Retry\n", "from IPython.display import display, FileLink\n", "import re\n", "import json\n", "import time\n", "import pandas as pd\n", "from bs4 import BeautifulSoup\n", "import altair as alt\n", "\n", "s = requests.Session()\n", "retries = Retry(total=5, backoff_factor=1, status_forcelist=[ 502, 503, 504 ])\n", "s.mount('https://', HTTPAdapter(max_retries=retries))\n", "s.mount('http://', HTTPAdapter(max_retries=retries))\n", "\n", "alt.renderers.enable('notebook')\n", "alt.data_transformers.enable('json')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## You'll need a Trove API key to harvest the data." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "api_key = ''" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define some functions to do the work" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def get_total_results(params):\n", "    '''\n", "    Get the total number of results for a search.\n", "    '''\n", "    these_params = params.copy()\n", "    these_params['n'] = 0\n", "    response = s.get('https://api.trove.nla.gov.au/v2/result', params=these_params)\n", "    data = response.json()\n", "    return int(data['response']['zone'][0]['records']['total'])\n", "\n", "\n", "def get_fulltext_url(links):\n", "    '''\n", "    Loop through the identifiers to find a link to the digital version of the map.\n", "    '''\n", "    url = None\n", "    for link in links:\n", "        if link['linktype'] == 'fulltext' and 'nla.obj' in link['value']:\n", "            url = link['value']\n", "            break\n", "    return url\n", "\n", "def get_copyright_status(response):\n", "    '''\n", "    Scrape copyright information from a digital work page.\n", "    '''\n", "    soup = BeautifulSoup(response.text, 'lxml')\n", "    copyright_status = soup.find('div', id='tab-access').strong.string\n", "    return copyright_status\n", "\n", "def get_work_data(url):\n", "    '''\n", "    Extract work data in a JSON string from the work's HTML page.\n", "    '''\n", "    response = s.get(url)\n", "    try:\n", "        work_data = json.loads(re.search(r'var work = JSON\\.parse\\(JSON\\.stringify\\((\\{.*\\})', response.text).group(1))\n", "    except (AttributeError, TypeError):\n", "        # No embedded work data found, so return an empty dict (not a string)\n", "        work_data = {}\n", "    else:\n", "        # If there's no copyright info in the work data, then scrape it\n", "        if 'copyrightPolicy' not in work_data:\n", "            work_data['copyrightPolicy'] = get_copyright_status(response)\n", "    return work_data\n", "\n", "def format_bytes(size):\n", "    '''\n", "    Convert a byte count to a human-friendly (value, unit) tuple.\n", "    '''\n", "    # 2**10 = 1024\n", "    power = 2**10\n", "    n = 0\n", "    power_labels = {0 : '', 1: 'K', 2: 'M', 3: 'G', 4: 'T'}\n", "    # Use >= so that exactly 1024 bytes is reported as 1KB\n", "    while size >= power:\n", "        size /= power\n", "        n += 1\n", "    return size, power_labels[n]+'B'\n", "\n", "def get_map_data(work_data):\n", "    '''\n", "    Look for file size information in the embedded data\n", "    '''\n", "    map_data = {}\n", "    width = None\n", "    height = None\n", "    num_bytes = None\n", "    try:\n", "        # Make sure there's a downloadable version\n", "        if work_data.get('accessConditions') == 'Unrestricted' and 'copies' in work_data:\n", "            for copy in work_data['copies']:\n", "                # Get the pixel dimensions\n", "                if 'technicalmetadata' in copy:\n", "                    width = copy['technicalmetadata'].get('width')\n", "                    height = copy['technicalmetadata'].get('height')\n", "                # Get filesize in bytes\n", "                elif copy['copyrole'] in ['m', 'o', 'i', 'fd'] and copy['access'] == 'true':\n", "                    num_bytes = copy.get('filesize')\n", "                if width and height and num_bytes:\n", "                    size, unit = format_bytes(num_bytes)\n", "                    # Convert bytes to something human friendly\n", "                    map_data['filesize_string'] = '{:.2f}{}'.format(size, unit)\n", "                    map_data['filesize'] = num_bytes\n", "                    map_data['width'] = width\n", "                    map_data['height'] = height\n", "                    map_data['copyright_status'] = work_data.get('copyrightPolicy')\n", "    except AttributeError:\n", "        pass\n", "    return map_data\n", "\n", "\n", "def get_maps():\n", "    '''\n", "    Harvest metadata about maps.\n", "    '''\n", "    url = 'https://api.trove.nla.gov.au/v2/result'\n", "    maps = []\n", "    params = {\n", "        'q': '\"nla.obj-\"',\n", "        'zone': 'map',\n", "        'l-availability': 'y',\n", "        'l-format': 'Map/Single map',\n", "        'bulkHarvest': 'true', # Needed to maintain a consistent order across requests\n", "        'key': api_key,\n", "        'n': 100,\n", "        'encoding': 'json'\n", "    }\n", "    start = '*'\n", "    total = get_total_results(params)\n", "    with tqdm_notebook(total=total) as pbar:\n", "        while start:\n", "            params['s'] = start\n", "            response = s.get(url, params=params)\n", "            data = response.json()\n", "            # If there's a nextStart value then we use it to request the next page of results\n", 
"            try:\n", "                start = data['response']['zone'][0]['records']['nextStart']\n", "            except KeyError:\n", "                start = None\n", "            for work in tqdm_notebook(data['response']['zone'][0]['records']['work'], leave=False):\n", "                # Check to see if there's a link to a digital version\n", "                try:\n", "                    fulltext_url = get_fulltext_url(work['identifier'])\n", "                except KeyError:\n", "                    pass\n", "                else:\n", "                    if fulltext_url:\n", "                        work_data = get_work_data(fulltext_url)\n", "                        map_data = get_map_data(work_data)\n", "                        if 'filesize' in map_data:\n", "                            trove_id = re.search(r'(nla\\.obj\\-\\d+)', fulltext_url).group(1)\n", "                            try:\n", "                                contributors = '|'.join(work.get('contributor'))\n", "                            except TypeError:\n", "                                contributors = work.get('contributor')\n", "                            # Get basic metadata\n", "                            # You could add more work data here\n", "                            # Check the Trove API docs for work record structure\n", "                            map_data['title'] = work['title']\n", "                            map_data['fulltext_url'] = fulltext_url\n", "                            map_data['trove_url'] = work.get('troveUrl')\n", "                            map_data['trove_id'] = trove_id\n", "                            map_data['date'] = work.get('issued')\n", "                            map_data['creators'] = contributors\n", "                            maps.append(map_data)\n", "                        time.sleep(0.2)\n", "            time.sleep(0.2)\n", "            pbar.update(100)\n", "    return maps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Download map data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "maps = get_maps()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Convert to dataframe and save to CSV" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Convert to dataframe\n", "df = pd.DataFrame(maps)\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "single_maps.csv
" ], "text/plain": [ "/Users/tim/mycode/glam-workbench/trove-maps/notebooks/single_maps.csv" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Save to CSV\n", "df.to_csv('single_maps.csv', index=False)\n", "display(FileLink('single_maps.csv'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Let's explore the results" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "# Reload data from CSV if necessary\n", "df = pd.read_csv('single_maps.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How many single maps have high-resolution downloads?" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "20,158 maps\n" ] } ], "source": [ "print('{:,} maps'.format(df.shape[0]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How much map data is available for download?" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "7.07TB\n" ] } ], "source": [ "size, unit = format_bytes(df['filesize'].sum())\n", "print('{:.2f}{}'.format(size, unit))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What's the copyright status of the maps?" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Out of Copyright 14967\n", "In Copyright 3271\n", "No known copyright restrictions 1506\n", "Edition Out of Copyright 245\n", "Copyright Undetermined 148\n", "Edition In Copyright 12\n", "Unknown 6\n", "Perpetual 2\n", "Copyright Uncertain 1\n", "Name: copyright_status, dtype: int64" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['copyright_status'].value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's show the copyright status as a chart..." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n" ] }, "metadata": { "jupyter-vega": "#d617ad97-dbb5-4b3d-b523-bb3eab3a72a8" }, "output_type": "display_data" }, { "data": { "application/javascript": [ "var spec = {\"config\": {\"view\": {\"width\": 400, \"height\": 300}}, \"data\": {\"name\": \"data-287059893b80d73521bd87ec94b84cf6\"}, \"mark\": \"bar\", \"encoding\": {\"tooltip\": {\"type\": \"quantitative\", \"field\": \"count\"}, \"x\": {\"type\": \"quantitative\", \"field\": \"count\"}, \"y\": {\"type\": \"nominal\", \"field\": \"status\"}}, \"$schema\": \"https://vega.github.io/schema/vega-lite/v2.6.0.json\", \"datasets\": {\"data-287059893b80d73521bd87ec94b84cf6\": [{\"status\": \"Out of Copyright\", \"count\": 14967}, {\"status\": \"In Copyright\", \"count\": 3271}, {\"status\": \"No known copyright restrictions\", \"count\": 1506}, {\"status\": \"Edition Out of Copyright\", \"count\": 245}, {\"status\": \"Copyright Undetermined\", \"count\": 148}, {\"status\": \"Edition In Copyright\", \"count\": 12}, {\"status\": \"Unknown\", \"count\": 6}, {\"status\": \"Perpetual\", \"count\": 2}, {\"status\": \"Copyright Uncertain\", \"count\": 1}]}};\n", "var opt = {};\n", "var selector = \"#d617ad97-dbb5-4b3d-b523-bb3eab3a72a8\";\n", "var type = \"vega-lite\";\n", "\n", "var output_area = this;\n", "\n", "require(['nbextensions/jupyter-vega/index'], function(vega) {\n", " vega.render(selector, spec, type, opt, output_area);\n", "}, function (err) {\n", " if (err.requireType !== 'scripterror') {\n", " throw(err);\n", " }\n", "});\n" ] }, "metadata": { "jupyter-vega": "#d617ad97-dbb5-4b3d-b523-bb3eab3a72a8" }, "output_type": "display_data" }, { "data": { "text/plain": [] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAjwAAADeCAYAAAA5HNo6AAAgAElEQVR4nO2dS4xVVf7vPyUFlrxsQJ4+8IEXaFsM2irB7r+0EiJ/tLV9RI1KStFubdr0w+5WW6VaiXLKtBAhatpgJHCJjfEam1RIh8SQkLohIakBAwYMqic1OpM7qMkd3MG5g99v99m1a+9Tp6gq6nf2+X6SFc5Z+7U+RVXtb621zl4ghBBCCCGEEEIIIYQQQgghhBBCCCGEEEIIIYQQQrQb//znP//P4OBgTUVFRUVFZTzl3//+97+n+x4mRNNUKpXadLdhMhgcHJRHIOQRC3nEQh5CTAMKPLGQRyzkEQt5xKIsHqJNUOCJhTxiIY9YyCMWZfEQbYICTyzkEQt5xEIesSiLx3SxCHgU2A4smaJrrABWN9i+Dlja5LnmAxszdauAmy6iXRPlFi7ia6bAEwt5xEIesZBHLMriMR1sBC4AnwG/A44Ce6foOk802H4IeDCnvh+Ykan7MTCYqdsD9Fxs4xqwF1jfYHv3GNtzUeCJhTxiIY9YyCMWZfGYDgaBzZm6ncBVWI9LHzAMnACWAT8EjgHHgSrwOtZDNABc4cf/CvgTsAvY7dd4GPiDb3/J674DjgA3YoHnGBa+zvl1/gbUfL80jQLPI8AB4AwwBDzt25cD//I2fwF0AvP8ulXgG/dY4a8PA//Trz8ELAReBM57+7r9vD3ApgbXzUWBJxbyiIU8YiGPWJTF41IzH7uhX1mwvQc4iAWSI8BfgDv9mAeAm7GwsAg4C2zx485gIepzLJhswQLCXr9mFbgNG0KrYcNCh/wcq4FPfN85vn1hpl2NAk83FtBuA3ZgASVx6aUe4h7Egt1R9zgEfAms9Gvu8dcngW3AAj/vat9/GLjcj3ukwXVzUeCJhTxiIY9YyCMWZfG41CSBomjuzAD1eTersZ6NOxl5M/8UG6r6HbAf6wUaBmZigedl368bCzEP+jEJQ9QDz2+9bivwrb+uYb0xadaTH3je8et85nXz/PisyxLgaizMfAS8gPVGDWIhZ5j6MFof8FOgAws63X5MDfgBIwNP3nXp7e3tqVQqtWyhBJTlB08esZBHLOQRi7J4TAdV4PHU+w5sWGkdduNf7PVrsKBzJ3A6tf9B4NfADX6ubizo4P8+5q+7scDzO0bOERqgHniSOTxbqA9j5QWehV4/M1V3GHjWr/Oh1832/TrcZUWqfj4WcPYCv/fyKyzwXEidNwk8V/m5KtjXKy/wZK9biAJPLOQRC3nEQh6xKIvHdPAecAqY6++3Y0NLYCFiu79+E7uh34mFh0VYT8Yg1lsCFl6SISzIDzzbsOA0B+txSQ9pFQWerpx2V1Pnvtbf305x8DhCfd7NUT/2Q+A3Xvdz980LPPf59lOpfWvYMNchFHhaHnnEQh6xkEcsyuIxHcwBPsYCQxUbYvqxb7sLG8Ya8m23UA88g16X7q15zbfN8vefYx93h3rgmeH/VrFgkcyLOcTIwJMMafVjk42zbEq1owa8m7pOXvC4HQtyQ9hQView1uuSdmxgdODp8bau9Gud9TYNYMNgh1DgaXnkEQt5xEIesSiLx3QyC7gGG/7JspL6sNKdwPfYhN1Fmf1+C3wwxnWuwYaEOrFhoqqfq4hOLEDkcRlwPfk9QEX8IKdurHPMT10vGRbrpN4rNm4UeGIhj1jIIxbyiEVZPFqBJPBk+RAbqlqRsy1NJ9ZL8j3WqzIVz84JjwJPLOQRC3nEQh6xKItHK9BJvccjzVWMfkBgI67H5sC0JQo8sZBHLOQRC3nEoiweok1Q4ImFPGIhj1jIIxZl8RBtggJPLOQRC3nEQh6xKIuHaBMUeGIhj1jIIxbyiEVZPESboMATC3nEQh6xkEcsyuIh2gQFnljIIxbyiIU8YlEWD9EmVCqV2oNv/MPK6/84Pt3tuVjK8oMnj1jIIxbyiEVZPESboMATC3nEQh6xkEcsyuIh2gQFnljIIxbyiIU8YlE
WD9EmKPDEQh6xkEcs5BGLsng0yyxsgc2XgDVTdI3Z2GKaRVyDLb7ZDPOBjZm6VcBNTR6/ALijyX0ng1uAJRM8x+ZGGxV4YiGPWMgjFvKIRVk8mmEpthbVMeD32MrjJyleYPNiWQK81WB7N/XVwdPsBdZn6n6MrTSeZg/Nr6O1AfiuwfZngeebPFczdDPaYbyMuVq6Ak8c5BELecRCHrEoi0czfAa8l6l7EOsBWQr0YYtyngCWAT/EwtFxbGXy1/2Yb4G7/PWPga+BbcB+oB94BPjKt28AzgFngE99Wze2AOgZYAh40kvN3y9Mta9R4LkOCzNfefs+8u3L3GEQOEo98GzDFim9gAWdVX7cMNarcjNwyttQwdb+Sntt9HZ/5Pvsw0Ja1b+2Hd6uTe55IOX4tLch7xozfd8h4BMUeFoKecRCHrGQRyzK4tEMZ4GtBdt6gIPAjcAR4C/Y6uY14AHsRl0FFgHvewELUO8Cz/i+24FbsWABFhSewMLCMPArLPDUsND0nO97OdbbtA0LDgmNAs8qP88TqbYuwULIl96O77HA0+nt/5l/DYaB/wHscpeZwDfAa8Bt/rXanvGa4+f41PepAZ/712YYWAkcoh7qhn2/HamvR941HgYGsJ6hj1DgaSnkEQt5xEIesSiLRzOcAR4t2DYArPbXq7FemTup36jBbvRPYD1CF7zuPHajfgbrDQILIuexcHQhdfwX1ANP0utyJfUbfB/w00y71pMfeN7x66S3nfX9LwA/8rqn/Fr3+L4veBnwba9hw29XeDte9u1HsV6btBdY4EmGrC5gXyOwYHc3IwPPZ75tnp+76Br7vQ7g2tTXg97e3p5KpVLLFgWeOMgjFvKIhTxiURaPZjiEDZ2k+QT4NdYbsdjr1mCB5U7gdGrfg75vBzb8ssX/7cCCwX7fLwk8dzAyMPVSDzzJHJ45NA48C337zFTdYepDUudS9f3A7e6yzOu2YYHnSSyg/D5V1lEPPFf7ddLbt2a8wALPAn993tuQXDsbeBLH2X7uomt8Azzu+y5APTwthTxiIY9YyCMWZfFohs3YDTu5Sa/DwsEiLERs9/o3sZv1nant87Aekqt9n4q/T4a28gLPIuzmfRN207/A2IHnvpx2V4HH/PW1/v528gPPemxIbgcWxD7BAs8iLJzNxXpaTmCB7DXqE6DP+fEdwG7gFSY38BRdYwc2BNdJfS5TIQo8sZBHLOQRC3nEoiwezfJbLHgMUZ+bAjaf5pzXV7GPVyeBZ9Dr9qbOc5cff5u/fwb42F8ngQds6GbQz3sBu7l3kx94evw66UnLYJOAk3bUsDlDyXXyAs+GlOMZbJI12HydIT/XUSx0bPb392NzaQa9JIEt7QWNA89djB148q6xBJu/VMWG2oZpgAJPLOQRC3nEQh6xKIvHeOjAJth25mxL19+JTfq9HLsxp7kdmzPTiE5sCOxyoAv4F6OHrLLML6i/DLjez9MMncCKgvNn668AZqSOu9GvN1UUXePqZq6rwBMLecRCHrGQRyzK4jEVJIEnyzNYD8qWJs6xF+u1GMQCz+WT1ro2RYEnFvKIhTxiIY9YlMVjKugkv8dlLuN7WOFV1Of+iAmiwBMLecRCHrGQRyzK4iHaBAWeWMgjFvKIhTxiURYP0SYo8MRCHrGQRyzkEYuyeIg2QYEnFvKIhTxiIY9YlMVDtAmVSqUU37Bl+cGTRyzkEQt5xKIsHqJNUOCJhTxiIY9YyCMWZfEQbYICTyzkEQt5xEIesSiLh2gTFHhiIY9YyCMW8ohFWTxEm6DAEwt5xEIesZBHLMriIdoEBZ5YyCMW8oiFPGJRFo/p4lpshfN02Ziz353YU5tvwRbLTJNXNx6uwtb2Gg93Ay8C91BfR2uy+Qkws2DbbL/2uFHgiYU8YiGPWMgjFmXxmC52YmtlvZsqr+XsdwJbybzb/70FOODbkrqLZRtwqsl9ZwBHfP83sPaeAdZM4PpFfIAtw5FHdqX3hGeB5xu
dVIEnFvKIhTxiIY9YlMVjuthJPbhk+RW2aOgJ4DwWanqATdjNvga8nqrrBPYBw759g5/ne+B9oOrnyq7vlQ48Y+37FNCfqVsHPNrg+nuBj92lH7gBC0uv+PYuLPStBb4BDgN/8usvAJb56wt+/r9igWcI+Nrb2ut1Vb/+ZgpQ4ImFPGIhj1jIIxZl8ZgudmJh5kCq/AJYTP3G/RQWbtYDh4BHvL4fG9pJ6u4DzmK9La9RX6m9ioWOG7Bg8YtMG9KBZ6x99wCVApei6/d5WQrsBj7EVoo/69sTl5XuuQdY7m1ZDLyHBZq1wGngGBZuasBL2HBczc+/CwtsRUNhCjzBkEcs5BELecSiLB7TxU7sJv5iqmwAHsNu7AmDjAw8d1EPFEldL7AjdUwVm59TBa7xug+A32TakA08jfbdjfXW5FF0/T5vH9iq74PALCzQLcN6bV7FAs8w9TlBSeA5R321+OeoB55hrFcJLJytx4LWW/9pUG9vT6VSqWVLQftbirL84MkjFvKIhTxiURaP6aJoSCtbn9zQD1EceI5RDxZggWAJFhwWeN27WLhIkw08jfbtxoaW0jzp1y66fh/wM6+7wesBDvr5hrCwszJz7iTw1LDglLQ1CTzpOTxnyAk8eSjwxEIesZBHLOQRi7J4TBc7gU+wnop0WY3d/K/EPsmVHdK6i/pcmqTueSxEzMA+wZQEgskMPEuxwLLN3y/w62xtcP0+bJ5R4psEua1Yb08ytFUUeAawnh2ALxg78PTQAAWeWMgjFvKIhTxiURaP6eLXWJjJFoBPsXBRxYJAOvAs8m1vpOoWA8exHpMa1vMCI0PMX8kf0vq+yX3B5t+c9n1rwH4spBVdvw8LNhf8mOQTXV3usNPfFwWejVgoqmIh5wjFgWezn/P+nHYDCjzRkEcs5BELecSiLB5RWYIFgzw6sbkwWVZgk5mnmqLrZOv7sGfqLGfkM3u6sBCzaIzrPI7NK5oJvMMYQ1bAFTR4NpACTyzkEQt5xEIesSiLh5g6ksCT5hash+btJo5/gvrQ1xBw/UQao8ATC3nEQh6xkEcsyuIhpo551D9NlTATWDiOc8zGhrEm/FRnBZ5YyCMW8oiFPGJRFg/RJijwxEIesZBHLOQRi7J4iDZBgScW8oiFPGIhj1iUxUO0CQo8sZBHLOQRC3nEoiweok1Q4ImFPGIhj1jIIxZl8RBtggJPLOQRC3nEQh6xKIuHaBMqlUrtwTf+Mar891v/a+V0t208lOUHTx6xkEcs5BGLsniINkGBJxbyiIU8YiGPWJTFQ7QJCjyxkEcs5BELecSiLB6iTVDgiYU8YiGPWMgjFmXxGA/XAvdlysac/W7ClkFYga1+niav7mK4A3gBeHqSzpfmSuqrome5G3gRWxV9wk8/LuAn2BOZ85jt1x43CjyxkEcs5BELecSiLB7jYScwALybKq/l7Pcn4FUsDD3hdf1YQEjXXQydwFFslfC/An/DVhrfPcZxyfWbYRD4c6ZuBrZa+SlspfZ3vQ1rmHw+AOYWbMuulp7wLPB8o5Mq8MRCHrGQRyzkEYuyeIyHncCBgm0bsRvxAPAvLPA8BPwBCyU14LtUHdgNesjLDq97H6hgoeM8cGvmOtuxoNGRqpsHDGO9PhXgXq/fDPRkrp9mKbbA5zBwAliGBZka8H1m36ew0JRmHfAoFsL2+XnOARt8+17gY3fpB27AwtIrvr0L+3qtBb4BDmNh8QSwwNtzAgt0+7CAtwr7en2Nrbje63VVv/5mClDgiYU8YiGPWMgjFmXxGA87sRByIFV+4dvOA7/BbvZVLPB0Yzf9OViIWJiqm+v7bcKGj6rYcM0hLBxcBxzEbvRpPsZu/FlOYL0cR4EHve5RP0f6+ml6fPuNWO/NX3zfYWB5Zt89WJjK4z5sRfM1WI9XEpb6vCzFeqA+BLb4vmDhpB9Y6e3b49etAouB97BAsxY4DRzDwk0NeAm43V8vBXZhYbFoKEyBJxjyiIU8YiG
PWJTFYzzsxG68L6bKBuwmPQxc5vt9xsjAA3Zj7kzVbcNCRsJhrPfnENabAvAA8G2mDZ+QHzzOAU+SH3jS108zQH3+z2rqQ0XDwBWZfXdjYSuPXuo9VGCB5Sos7DzidVdjPT2z/PzLsDD3KhZ4hqkPuSWB55wfB/Ac9cAznHIZANZjQeut/zSot7enUqnUskWBJw7yiIU8YiGPWJTFYzwUDWmtw4ZdEiqMHXheTm0DCyq/wAJPElg2M3oY6g9Yb1KahX7+9YwMPDsoDjwdWHBY7O/XpM6bF3i6M45gAeuYl0dS9cPAEizw/MzrbvB6vE3d2NDUSi/pcyeBp4YFJ7CAmASe9ByeM+QEnjwUeGIhj1jIIxbyiEVZPMbDTqyHpTNTLsNu0muwoHCO/MDTlaq7GQsYc7EeoiFgEWMHnhV+rof9fQcWsJJhok+BP/rrY4wMPF2Zcx3G5gQBvIkNOUF+4Fnq9cmntxa451ZsLtJBrIfmHuqBpA8bNoORYXEr1tuTtLko8AxgPTsAXzB24OmhAQo8sZBHLOQRC3nEoiwe4+HXWHDIFrCb/jB2sz6Lzefpph54+rHJzOm6fdQn2+73ukOMDDzZIS2w0JFca9jPncy52YAFqWSicBJ4kuunuQsLD0N+rlu8Pi/wgM2/Oe371rzNnVg4Oe7nqWE9P2CBZxALM0kgBAtew1gIguLAsxH7Wla9nUcoDjyb/Zz357QbUOCJhjxiIY9YyCMWZfGYTOZhPR95dGKTkrMswHp2xstl2GTjvGNnYJOPm7k+WODIzu9pxIqCc2Xr+7Bn6ixn5Efiu7AQM5b348A12ETkdxhjyAoLaYUfvVfgiYU8YiGPWMgjFpPhkYSDhdg8GFEuksCT5hash+btJo5/gvrQ1xD2MMeLRoEnFvKIhTxiIY9YTNTjGWz441rqQ0OHJt4sEYh5jO45msnoj8c3YjY2jDXhpzor8MRCHrGQRyzkEYuJelzAntr7ARZ2Dvi/RU/YFWJCKPDEQh6xkEcs5BGLiXh0YeFmCza8cQGbj1JDQ1tiilDgiYU8YiGPWMgjFhP1OEX9Uz092KeR0g+UE2JSUeCJhTxiIY9YyCMWE/XYgH3E+TT2KZ5B7Nk1QkwJlUpFP3iBkEcs5BELecRisj2yD8UTYlJR4ImFPGIhj1jIIxYT9fgem7+TLdnnxwgxKRQNaak0X7a9/tX/nqz/D/0ijIU8YiGPWEzGHJ7zXpK5PMmK4UJMOgo8CjxTgTxiIY9YyCOfXVjg0dCWmBIUeBR4pgJ5xEIesZCHsQxbimAF9vDB97FenlUTb5oQo1HgUeCZCuQRC3nEQh5GsgBlugxjT+IVl4argNvHeczdwIvYqugTfvpxAT+h+Ptgtl973CjwKPBMBfKIhTxiIQ9jF/Chlwq2EvkNk9Au0TzbsLlUzTADW638FPAG8C62UvmaRgddJB9Q/MTt7GrpCc9iK9YXosCjwDMVyCMW8oiFPIzDwH+l3l+D3UyXTuSkYlykA8/32LBiFTgBzM/s+xTQn6lbBzyKPSxyH9ZDdw57xhLAXuBj7BlL/VigfQN4xbd3AQPAWuAb7HviT379Bdiw5wnsSdz7gL9igWcI+Nrb2ut1Vb/+5iJZBR4FnqlAHrGQRyza3aMbu8nVsBvXgJfkk1rjWVhSTIx04KliAeUG7P/jF5l992A9cXnch61ovgZ4DQtPYKul92EhdjfWm7fF9wULJ/3ASuz/fg/2EMoqsBh4Dws0a7EHVB7Dwk0NeAkbjqv5+Xdhga1wSFSBR4FnKpBHLOQRi3b32E5+4BnAbm7i0pENPNf46w+A32T23Y311uTRC+xIva9i84P6gEe87mqsp2cW1hOzDOu1eRULPMPU5wQlgeecHwfwHPXAk16CZABYjwWtt/7ToN7enkqlUsuW6Q4MrV4UeEYjj1jIIxbyMN4D7piktoiLIxt4Fvjrdxm9zEc3NrSU5kkshByjHmzAAskSLPD8zOtu8HqAg36
+ISzsrMycOwk8NSw4JW1NAk96Ds8ZcgJPHgo8CjxTgTxiIY9YyMO4ApuovM/Lx8BJiierislnPIFnKRZYtvn7BVjw2IpNFj6I9dDcQz2Q9GELwwLsBA74661Yb08ytFUUeAawnh2ALxg78PTQAAUeBZ6pQB6xkEcs5GH0Mfpj6TUUeC4l26jPt0kHnr8yekgLbP7NaeqPFNiPDS0tBo5Tn4f1pO/fhwWbC35M8omuLiw87fT3RYFnIxaKqljIOUJx4Nns57y/SFaBR4FnKpBHLOQRC3nYTbIG/BmbtPpn7C/0AaBjUlonppIV5C8Bkq3vw56ps5yRz+zpwkLMojGu8zg2r2gm8A5jDFlhvYaFzwZS4FHgmQrkEQt5xEIecBkWePZin6w5AfzQ66biuS5iekgCT5pbsB6at5s4/gnqQ19DwPUTaYwCjwLPVCCPWMgjFvIwDmMB5yk0pFVW5lH/NFXCTMb36IHZ2DDWhJ/qrMCjwDMVyCMW8oiFPIzZwEP+bzc2ofXeiTdLiHwUeBR4pgJ5xEIesWh3j8uBH2BDFc/56x9gn+6pAddNVgOFSKPAo8AzFcgjFvKIRbt7/Jn8T2clJW8yrBATRoFHgWcqkEcs5BGLdvf4GbZEwTD2UeZKqmSXMxBi0qhUKm39gxcNecRCHrGQRywm6vEQI4evspNbhZhUFHhiIY9YyCMW8ojFRD3WYw+xm4s9PG6Y+iraQkw6IYe0Xv/q6Hg99AskFvKIhTxiIQ/jDPbwud9gc3cGGLkopBCTigJPLOQRC3nEQh6xmIhHFxZyHsKWNqgC13rduklpnRAZFHhiIY9YyCMW8ojFRD0uAN9hIefv2PpNNWDOhFsmRA4KPLGQRyzkEQt5xGKiHtupfxR9HTac1eiX/wzgPkauv7QAuL3J691KfXHMMvIT7CnGeczGnnM0HcwGNkzifheNAk8s5BELecRCHrGYDI8FwHx/PVZwmY2Fo+9SdfdSX+17LI4Cm8bTuBbjA4qX5ciuMJ7wLPD8JLahn9FLQCyh8aKfe7EJ7GPtN2EUeGIhj1jIIxbyiMWl9kgHnke9Lh14nscWmBwCduQcnwSeK7HFSrdiC5dWsKc+n8d6gfLOtQK7mXf4Nc9gC6DeBnwOPIItjXHGj3k65/q/xOYqnQW2NGjzXuBjb1M/cAPwBvVPsHVhE7zXAt9ga5L9yZ0WAMv89QVgHzZUuMqv8bW3odfrqljP2uZUO1dkzjsPOOT7fkO9h223n3MA+BHwN+r/P9uA/d7+R4Cv/JgN/nUeBH4HPOnHDPk5kv3uwgLasH9dOxt8jZ/yfQexxUYLUeCJhTxiIY9YyCMW0xV4bsRueD+gHnjmYjfkTdjNtsroJzYfxULOKWCX1x3CbsrXAQexgFB0riHgJj+2BqwGXsVW/e7Gbs63YcHlfObayTlXAw9gN+ii6/R5WYqFig+xgHTWz7XZ27zS27EHWO7HLwbewwLNWuxj/8ewcFMDXsJ60mp+/l1Y6EsPhWXPu9O/djf71+tL4G53nI8FsU+xuVc1bGHQZ/z1dixEJl+Pfuzhkjf51/Ma4KT7r8rs9zI21DmADdflfY2v8LqVWEjKft1HoMATC3nEQh6xkEcspivwgN2oP6UeeLYBR1L7HsY+AZbmKBYKBrGeGrAb+FP++gHg2wbn+rvvexLriXgO6/HYiN2MP/P956XambDNz5OwHniw4Dp9WG8GwNXe3lnYjX0ZFspexW7yw9SHkJLAc86Pw9uYBJ70R/4HvA2vMXoYKXvek8BHwAvY130QG34axnpx7k+dt+avn/HrQj3ILPA2JtyE9bb1AT9N7bcws98LWO9RN6O/xh1YEDyO9e7MSw7q7e3tqVQqtWyZ9oCjwPMf5BELecRCHrGYzsCT9LjswgLPy9hQUMJRRi9TcRQLFYPAY153CAseYD0n3zU415PAJ9hQ0RZsKGsICyPdWE9Mtp0Jr2DDMQnLGlynD1t+A2w4a9hfH/TrDGG
hZKW3JSEJPDXgKq/bRj3wpOfwnKFx4Emfd9Db+Xsvv/L6H2LDZYMpt3Tg2e91SZC5NXPeBdhCstnA8yNG9tS8iIW8bvK/xkuwAHjaz38ZBSjwxEIesZBHLOQRi+kMPAA/9/ffY8Mt57FhouVYKFiUOT6Zw7PNt88mP/AUnWuZX+8wNpxWw3oWYOzAsxq7Gc/FhprON7hOH9Djx+2kHia2YuEiGdoqCjwDWM8OwBeMHXh6GEn2vB9iD4cE+5ofxr6GH3jdOm8X7t1FfuC5zNt4MxZ0BrGeqD7s03fJfh2+3yo/13Hs/6ib0V/jpVgPVCfWIzWMzUHKRYEnFvKIhTxiIY9YTHfgAZucm0xa3kd9Eu5+RnMUGwIDu0nuYnTg+XaMc53HJh+DBYud/rqbxoEHbHJ0MkE5GbLKu04fFgYu+LY1Xt/l+yXXLAo8G7FQVMVCzhGKA89mP+f9qW3Z8671813wfTdgQ0cX/JznqX/Sqx/4FxZ4Pva69Nycp9xtCJufBBa4qsAdqf22+z5Vb+tcir/GR6hPBv87DVDgiYU8YiGPWMgjFhE9FjC6ZyfCuRLmMXrpjOx1+rBJussZ+RHvLuzGPlabHscmA88E3mHsj3pfweiPkudxvbch4TKs52pWqq6T0ZPFs8zK2Wd+wX7XUp9v1YiVWK9bQxR4YiGPWMgjFvKIRVk8opEEnjS3YL0pbzdx/BPUh76GsKAiUOCJhjxiIY9YyCMWZfGIRl4v0Ezs00vNMhsbSmqm56ZtUOCJhTxiIY9YyCMWZfEQbYICTyzkEQt5xEIesSiLh2gTFHhiIY9YyCMW8ohFWTxEm6DAEwt5xEIesZBHLMriIdqESqVSim/YsvzgySMW8oiFPGJRFg/RJijwxEIesZBHLOQRi7J4iDYh5JCWioqKispFl/9+/etbL8X9Q4FHtBQKPCoqKirlKgo8QuSgwKOioqJSrqLAI0QOCjwqKioq5Z8ThWEAABC+SURBVCoKPO3BHcALwNPYmlaTyZXYiuh53A28CNzD1D3J+SfY06XzmO3XHjcKPCoqKirlKgo85aYTW/n9DPBX4G/YyuW7GxwDtpJ5swFlEPhzpm4GtjL5KeAN4F1vwxomnw+wFdLzyK78nvAs9VXbc1HgUVFRUSlXUeApN9uxoJFeRXweMIz1+lSAe71+M9CDhaIa8F3mXEuxxUqHgRPAMizI1IDvM/s+hYWmNOuAR7EQts/Pcw7Y4Nv3Ah9jAaofuAELS6/49i5gAFgLfAMcBv7kbVng7TmBBbp9WMBbhS2K+jW2enyv11X9+pspQIFHRUVFpVxFgafcfIzd+LOcwHo5jgIPet2jwEFgDhZisguQ9vj2G7Hem7/4vsPA8sy+e7Awlcd92Orsa4DXqIelPi9LsR6oD4Etvi9YOOkHVnr79vh1q8Bi4D0s0KwFTgPHsHBTA14CbvfXS4FdwPsUD4Up8KioqKiUrCjwlJtPyA8e54AnyQ88YMEguwr7APX5P6upDxUNA1dk9t2Nha08eoEdqfdV4Cos7DzidVdjPT2z/PzLsF6bV7HAM0x9yC0JPOf8OIDnqAee4ZTLALAeC1pv/adBvb09lUqlli3T/cOpoqKiojJ5RYGn3PwBOJ+pW4gFmvWMDDw7KA48HVhwWOzv16TOmxd4urGhpTRPYiHkGPVgkxy/BAs8P/O6G7web1M3NjS10kv63EngqWHBCWwSdRJ40nN4zpATePJQ4FFRUVEpV1HgKTcrsCDwsL/vwHp8kmGiT4E/+utjjAw8XZlzHcbmBAG8iQ05QX7gWer1yae3FmDBYys2Wfgg1kNzD/VA0ocNmwHsBA74661Yb0/S5qLAM4D17AB8wdiBp4cGKPCoqKiolKso8JSfbVj4SCbq9lOfc7MB66lJJgongacf+FfmPHdh4WHIz3WL1+cFHrD5N6d93xqwH+s1Wgwc9/PUsJ4fsMAziIWZKvVPdHX5NXb6+6LAsxELRVVv5xGKA89mP+f
9Oe0GFHhUVFRUylYUeNqDy7DJxotyts3AJh+n6cSeYZPHSkbP72nEioJzZev7sGfqLGfkR+K7sBCT1/Y0jwPXYBOR32GMISsspBV+9F6BR0VFRaVcRYFHRCEJPGluwXpo3m7i+CeoD30NAddPpDEKPCoqKirlKgo8IgrzGN1zNJPRH49vxGxsGGvCT3VW4FFRUVEpV1HgESIHBR4VFRWVchUFHiFyUOBRUVFRKVdR4BEiBwUeFRUVlXIVBR4hcqhUKqX4hi3LD548YiGPWMgjFmXxEG2CAk8s5BELecRCHrEoi4doExR4YiGPWMgjFvKIRVk8RJugwBMLecRCHrGQRyzK4iHaBAWeWMgjFvKIhTxiURYP0SYo8MRCHrGQRyzkEYuyeIjmmAvclyqrprc5o+jCFk4tRIEnFvKIhTxiIY9YlMVDNMc6bDXyt7B1sL4GTmCLmE42e7EV0MdDdsX1USjwxEIesZBHLOQRi7J4iOZYhy3gmTATqGGLgd4MnPLtFWz9rG3AfqAf2AicwYJMFTiCrWzeAfzFjzsD3A486ecdwkLP16lrfoutyL7cX1eB77C1uRR4Wgx5xEIesZBHLMriIZoj6eHZBjyEhZkq1sPzDfAacBu2svl24BksuGwH5vjrXcBS4CTwoJ/zPBaafoeFnC7fvg1Yi62snnABWzF9D/AhMB/4xK+twNNiyCMW8oiFPGJRFg/RHOuw0PKtl93AdVhPTQ14GXgBOAp8hgWeY6nja8Bif/0kFlTeBPr8uBd8n2u97qcUB54uYCvwB2AA61UaEXh6e3t7KpVKLVsm4esw7ZTlB08esZBHLOQRi7J4iObIDmklXI0Fld+nylYs8OxP7VcDLvfXz2LDWp9iQ1LpYxdQHHiGscDTi/Uk7QQOkBN48lDgiYU8YiGPWMgjFmXxEM1RFHjAQsl6bE7ObuAV8gPPfcAMrOfnDWxY62uvuw4b3pqFBZ77sDA1DMwDfuTnuB4LNvf6cQNYAFLgaTHkEQt5xEIesSiLh2iOdcBgwbaHfdsgFloWYYHn49Q+NSwYDWEh5UpgNhZ4Bn37Tt+3B5sftJD6XKEB//d64Ld+nvNYT1EVuBMFnpZCHrGQRyzkEYuyeIjJoRO4keKPqdd8nxU525ZhQ1lp5he8TliIBSaw8NQxVgMVeGIhj1jIIxbyiEVZPMSloYYNQU0bCjyxkEcs5BELecSiLB7i0pDtwbnkKPDEQh6xkEcs5BGLsniINkGBJxbyiIU8YiGPWJTFQ7QJCjyxkEcs5BELecSiLB6iTVDgiYU8YiGPWMgjFmXxEG2CAk8s5BELecRCHrEoi4doExR4YiGPWMgjFvKIRVk8RJugwBMLecRCHrGQRyzK4iHaBAWeWMgjFvKIhTxiURYP0SYo8MRCHrGQRyzkEYuyeIg2QYEnFvKIhTxiIY9YlMVDNM98YGOmbhVwU4NjbiXAU5ZBgSca8oiFPGIhj1iUxUM0z48ZvWL6Hmx18yKOApumqkHjQYEnFvKIhTxiIY9YlMVDNE+jwHMd8B3wFVAFPvLtSeC5EjgBbAXeByp+rvNYLxDA88CQlx3Yyur92Ero9wJnsNXYbwM+Bx4BDnj9EPB0o8Yr8MRCHrGQRyzkEYuyeIjmaRR4VmEroj8B3Omvl2CBZytwCtjlxxzCgsx1wEFgHzAXC0qbgG3+ejYWZG7yY2vAauBV4G2gGxjGAtAOLDwVosATC3nEQh6xkEcsyuIhmmc9+YHnHSzwpLed9f2PYuFlEOupAQs8T/nrB4BvsZBzJHX8YeAh4O++70ms9+g54BtsLlE38JnvPw8LRAD09vb2VCqVWrZcjHQ0yvKDJ49YyCMW8ohFWTxE8yzEQsXMVN1h4Fks8JxL1fdTDzyHscDzmG87BDzorzdjQ2EvA3tTxx8FfgE8CXwCXAC2YENZQ8AsLPB86PvPJhV48lDgiYU8YiGPWMgjFmXxEOOjSj24XOvvb6dx4NmE9eAMYcHkEKM
Dz83YkNRcYLnvuwhYhgWZw8AP/PVxP7YbBZ6WRR6xkEcs5BGLsniI8bEJmzcziAWMd72+UeC51+tOYnNxDjEy8Hzrr/dhAWoY2J8613ngl/56ANjpr7tR4GlZ5BELecRCHrEoi4cYP5cB1wNdU3DuBVjPzqSjwBMLecRCHrGQRyzK4iHaBAWeWMgjFvKIhTxiURYP0SYo8MRCHrGQRyzkEYuyeIg2QYEnFvKIhTxiIY9YlMVDtAkKPLGQRyzkEQt5xKIsHqJNUOCJhTxiIY9YyCMWZfEQbYICTyzkEQt5xEIesSiLh2gT9u/f///ylptQUVFRUVFpVA4ePPh/p/seJkTTVCrl6OGRRyzkEQt5xEIeQkwDZfmGlUcs5BELecRCHkJMA2X5hpVHLOQRC3nEQh5CTANl+YaVRyzkEQt5xEIeQkwDvb29PdPdhslAHrGQRyzkEQt5CCGEEEIIIYQQQgghxES4DJgz3Y3IYQEwo8H2y4FZOfVzMKc00+24EGtvHq3iMQeY12B7q3jMAzoabI/uMT+nrtH/C+S3vei4Iv/JJs9j0RjHtIoHwIoGx0yFRycwO1PXgf0ebcR4vq+L2i1ES/A8cA44AZwEFk9vcwC4DjgDHAe+Bd7ObO8EDgD9vt8n2A/hVZjDCeA80O37T7fjSmAY2JipbxWPLuAo0If9n7yb2d4qHguAb4AvMZfuzPboHquAHd6GhDu8Dcf93x9njilqe95xRf6TTZ7HfcAF4Ctv088zx7SKR8KDQI3RwXoqPGYAtwH7gX2p+i3AAHAY+A5Y3WRb8r6vi/YVomXoxH4or/T3+4E3p685/+Ft6jfVLqyN6b+WNgJnU+8vAPcAbwDve90yP24e0+s4C/tlc47RgadVPLqBT/11B/AoI3veWsnja3+90duZJrrHG1jwTN9gT2I3NoDH/H32mGzbZxccV+Q/2eR5nAI2++v/YvT/Tat4AFzv9XmBZyo85gJ7gdOMDDxV4Bp/vRa4pYm2FH1fF7VbiJbhBmAw9f5V4ItpakuaK7CgA/Aw1sb0L45ngc9S778DngMOAk95XQf2Q/lfTK/jXmAb8C9GB55W8XgP+8uuiv3FuDWzvVU8lmMOXwFDwGuZ7a3gsY6RN9gh4Fp/vR7zS5PX9hsLjivynwqyHvOoh+i9mXZA63hcDnyPBY28wDOVHjupB57Zfu5vsd7lT7FgNFZbir6vi9otRMuwjpF/SSW/3CMwC+jBfljvy2z7NdbVm/AF9svha+DxVH0VeIDpc3wM606G/MDTKh5fYr8EV2O/9IYY+Yu8VTy2eBt2Y39B92W2t4JH9gY7jAU5qN840+S1/fqC44r8p4KsB8BSbMjxHDYMnKZVPD7GhoQgP/BMpUc68Kzy6//ZX5/MOXY839dF7RaiZbiCkT+Uv/cy3XRhY9nfUP8lkGaTb084jo157wJ+53UzsF8iyV860+F4BgsKZ70NFxg5x2ITreGxF9iTel/FfokmbKI1PL4EXvfXSdf9ktT2TcT3yN5gT2PzP8Daejyzf17bLys4bhP5/lNB1uM67Gelh3rvbppW8JiDfU+cpf4zf5aRE46n0iMdeK7y6ycTqrcDhzL7j+f7uqjdQrQU57Bx4fnYDXpL490vCb9k9C9usHHoZdgnOYaxSag3+esrsYmOyRyGJzAfmD7Ha4GbvZzGekdm03oez2Htn4X99VnFfum1mscfsa79Duyv0yo2j62VPLJB4SNsjsUM7Ga32+s3YN9rRW3PO67I/1J4fAX8IWe/VvLooP7zfjMWHFZ7/aXwSAeeDqyXaKu//hybYE0Tbcn7vi7aV4iW4ufYD9Iw9kun0cd1LxVfYr8s0uVmLAS95PtUsDbXsB90sB6rU9iNbBi42+sjOPZRH9JqNY/Lgb9jv0AvUB/LbzWPpdhf3ENekra3kkc2KNzk7api/zcLvb6GBbmithcdl+c/FWQ9qoz8eU/mIrWaR5p0T8ml8NiJ9cYmbPa2DQHHaP5
7I+/7umhfIVqO2eQPHUXjWeqf5ADrts17Bsa1wMxMXSTHVvXIPhepVT1WMLI9reqR0IkNCaUD1+eMfKZLXtvzjoNi/+lAHhPz6KAedMbTlqLv67x9hRBTwE8px7ixPGJRFo809053AyYJeUw+kdoihBBCCCGEEEIIIYQQQgghhBBCCCGEEEIIIYQQQgghxEhuBd6iviaSEEIIIUTpeBp7qFt2zTQhhBBCiCnnOmzV6Cq26vtDXn8F9ij+5OmxX2NPYwZ7+vKX/voa7AnNO7DVpM8Cvdgj+M/4+W7EnpBbw1aUT681JoQQQggx5SRLlXyJhZJhYC7wotcfxR7JXwM+8WOGqK8XlKw8/Tb22P2an+ND/3cQewr1Qd9WYfTTbYUQQgghpoxZWCg57O+vBx7GenJOMXLV55P+fhZjB55nfNvf/f1CNKQlhBBCiGliMRZCDvj7BdiCj/OwIamh1L7f+r5zvH7A69cxOvBs9W0HUOARQgghRAD6sZ6b7diq9TVsYcQ/+us3gef99Vd+zEl//2LqdbOBZyfQNcVOQgghhBAjeAALIknZ6fWLsTCU1A8BP/Jtj6Tqv/d/36IeeB7w/fZTDzyrsWBVA26bSiEhhBBCiDwuB24GZudsW47N7cmumD6T8U8+nokNm80Y53FCCCGEEEIIIYQQQgghhBBCCCGEEEIIIYQQQgghhBiL/w/GT/RUbOg0fAAAAABJRU5ErkJggg==" }, "metadata": { "jupyter-vega": "#d617ad97-dbb5-4b3d-b523-bb3eab3a72a8" }, "output_type": "display_data" } ], "source": [ "counts = df['copyright_status'].value_counts().to_frame().reset_index()\n", "counts.columns = ['status', 'count']\n", "alt.Chart(counts).mark_bar().encode(\n", " y='status:N',\n", " x='count',\n", " tooltip='count'\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at the sizes of the download files.\n", "\n", "So while most are less than 500MB, almost 5,000 are between 0.5 and 1GB!" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n" ] }, "metadata": { "jupyter-vega": "#5bcfa0e3-78bd-4117-9996-e18d50e6bb8a" }, "output_type": "display_data" }, { "data": { "application/javascript": [ "var spec = {\"config\": {\"view\": {\"width\": 400, \"height\": 300}}, \"data\": {\"url\": \"altair-data-66980e5ca028c9f4bae3848ab0314d98.json\", \"format\": {\"type\": \"json\"}}, \"mark\": \"bar\", \"encoding\": {\"tooltip\": {\"type\": \"quantitative\", \"aggregate\": \"count\"}, \"x\": {\"type\": \"quantitative\", \"bin\": true, \"field\": \"mb\", \"title\": \"MB\"}, \"y\": {\"type\": \"quantitative\", \"aggregate\": \"count\"}}, \"$schema\": \"https://vega.github.io/schema/vega-lite/v2.6.0.json\"};\n", "var opt = {};\n", "var selector = \"#5bcfa0e3-78bd-4117-9996-e18d50e6bb8a\";\n", "var type = \"vega-lite\";\n", "\n", "var output_area = this;\n", "\n", "require(['nbextensions/jupyter-vega/index'], function(vega) {\n", " vega.render(selector, spec, type, opt, output_area);\n", "}, function (err) {\n", " if (err.requireType !== 'scripterror') {\n", " throw(err);\n", " }\n", "});\n" ] }, "metadata": { "jupyter-vega": "#5bcfa0e3-78bd-4117-9996-e18d50e6bb8a" }, "output_type": "display_data" }, { "data": { "text/plain": [] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAdAAAAFbCAYAAABlKt8bAAAebklEQVR4nO3dUYxV52Hg8T8zgGEwYFzIgmM3hHXk2F3bcpo1rNlid2NROyHTRjGKo4QK1Ox2Gy9bSDcxbFxwGtmdqWoiYzWVEVYsV9a4qEotZKEKKUKyLGFZsiUeeGCl0T7wNK/z0qfV3Yfvu713zpwzcL9z57v3nvn/pE9z59xvZs7xvZ4/58y554IkSZIkSZIkSZIkSZIkSZKk4bIamKhYvm2Jr9sAjBWWjcXltzNXkqSRNA48CpwFflFy/2vA5ZLlW+PyS8B14HBcfgS4FpdfJsS3aq4kSSPrTuAM8CGLAzpJJ4RFJ4BX4u3tQAvYGD9ujsvPAicr5pbt7UqSNHJeYGFAdxHCuY/ygJ4Hno+3VxGiuA+Y7ZpzFHirYu6ufq24JEmD1B3QdcBV4CHgCcoDegF4ruvzOeAZ4EbXskOEeJbN3QkwPT19empqqtU9zp8//6+zs7Mth8PhcDh6HP+3fg571x3Q/YS9xE8IQWwBbxbmnwKOxdvjwDzhsGyLsJcJcDyOsrmVJxNNTU21amyHJGmFmp2dHUg/ugM6AXwpjoOEvdF743174v2TdPZM23MgnEC0F9gUl+1fYm6pqamp1oET7zV7/PjdL9V7uCRJRYMM6JmS5Y+z8BBuC3gQWA9cIRyOnQd2x/sn4+fzwAxhb7RqbikDKklKMaiA3q5zhLNt2+4D1hTmTAA7Sr62bO4iBlSSlGLYA/rkcv8AAypJSjHsAV12BlSSlMKAGlBJUgIDakAlSQkMqAGVJCUwoAZUkpTAgBpQSVICA2pAJUkJDKgBlSQlMKAGVJKUwIAaUElSAgNqQCVJCQyoAZUkJTCgBlSSlMCAGlBJUgIDakAlSQkMqAGVJCUwoAZUkpTAgBpQSVICA2pAJUkJDKgBlSQlMKAGVJKUwIAaUElSAgNqQCVJCQyoAZUkJTCgBlSSlMCAGlBJUgIDakAlSQkMqAGVJCUwoAZUkpTAgBpQSVICA2pAJUkJDKgBlSQlGFRAVwMTJcvuusXXbQDGCsvG4vLbmbuIAZUkpcgd0HHgUeAs8Iuu5T8CbgDvAO8DDxS+bitwGbgEXAcOx+VHgGtx+WVg2xJzSxlQSVKK3AG9EzgDfEgnoGuBFp29yJeANwpfdwJ4Jd7eHudvjB83x+VngZMVc4t7u//GgEqSUgzqEO4LLNwD3RI/TgBXgecL8893LVtFiOI+YLZrzlHgrYq5u6pWxIBKklIMS0ABvkI4HPsusK5w3wXgua7P54BnCId92w4R4lk2d2fVihhQSVKKYQno1wih+07F/FPAsXh7HJgn7K22CHuZAMfjKJs7BjA9PX16amqqVRwDD9wyj48/u96anZ11OBwOR59Hf5LYm+6AroqRe7xk3h5CKCcJJwYBHCQc5oWwx7oX2BSX7V9ibqmVEFD3QCWp/wYZ0DPx9v2EPcnu8Xa8rwU8CKwHrhD2UueB3fH+yfj5PDBDiHHV3FIGVJKUYlABvV3nCGfbtt0HrCnMmQB2lHxt2dxFDKgkKcWwB/TJ5f4BBlSSlGLYA7rsDKgkKYUBNaCSpAQG1IBKkhIYUAMqSUpgQA2oJCmBATWgkqQEBtSASpISGFADKklKYEANqCQpgQE1oJKkBAbUgEqSEhhQAypJSmBADagkKYEBNaCSpAQG1IBKkhIYUAMqSUpgQA2oJCmBATWgkqQEBtSASpISGFADKklKYEANqCQpgQE1oJKkBAbUgEqSEhhQAypJSmBADagkKYEBNaCSpAQG1IBKkhIYUAMqSUpgQA2oJCmBATWgkqQEBtSASpISGFADKklKYEANqCQpwaACuhqYKCwbAzbc4us2xHm383VlcxcxoJKkFLkDOg48CpwFftG1/AhwDbgEXAa2Fb5ua1x+CbgOHF7i66rmljKgkqQUuQN6J3AG+JBOQFcDLWBz/PwscLLwdSeAV+Lt7XH+xoqvK5t
b3Nv9NwZUkpRiUIdwX6AT0C8Cs133HQXeKsw/Dzwfb68iRHFfxdeVzd1VtSIGVJKUYhgC+ghwo+u+Q4QIdrsAPNf1+RzwTMXXlc3dWbUiBlSSlGIYArqesJe4Kn5+PI5up4Bj8fY4ME84LFv2dWVzxwCmp6dPT01NtYpj4IFb5vHxZ9dbs7OzDofD4ejz6GcYb1d3QCGcCLQX2ARcBfbH5XsIoZwknBgEcDDOqfq6qrmlVkJA3QOVpP4bZEDPdH0+SdhTnAdm6OxVtoAHCXupVwiHY+eB3Ut8XdXcUgZUkpRiUAEtMwHsKCw7Rzjbtu0+YM1tfF3V3EUMqCQpxTAFtMyTy/0DDKgkKcWwB3TZGVBJUgoDakAlSQkMqAGVJCUwoAZUkpTAgBpQSVICA2pAJUkJDKgBlSQlMKAGVJKUwIAaUElSAgNqQCVJCQyoAZUkJTCgBlSSlMCAGlBJUoJ+BHRL/Hg38Ejdb5abAZUkpagb0O8R3vT6vvixBbxdf7XyMaCSpBR1A3oDuAK8SojnG/HjnfVXLQ8DKklKUSeg6wix3A9cI8R0V1w2ModyDagkKUXdPdArwE1CNE8DvwbmgdX1Vy0PAypJSlE3oHuAD+PYAcwCR/uwXtkYUElSin6/jGVdP79ZDgZUkpQiNaA3gbklhicRDdMwoJLUd6kB/QC4TDhk2yL83fNavD0LrO/XCi43AypJSlH3EO514FfAHfHzPyNEdGPN9crGgEqSUtQJ6GpCLN8HxuOyv4jLHqu/ankYUElSirp7oBfpHMKdi7c/Acbqr1oeBlSSlKJuQDcDPyDshX4K/G/CZf1GhgGVJKWoewj3feBY/1YnPwMqSUpRdw/0A8Lh2839WZ38DKgkKUU/LuXXfheW2a6xof6q5WFAJUkp+rEHerVkTNRftTwMqCQpRb8u5bcJ2NaPb5SbAZUkpagb0F3ABTqHcS8D+/qwXtkYUElSiroBbcfzH4G3CCcUzeMh3OEaBlSS+q5OQLcQ4vnTrmXfIv1KRGPc+hKAG1h8kYYxyk9aKpu7iAGVJKXox6X83gHWEi7n99O4rNdf2IcJb8b9DuG1pVsK928lHB6+RLj+7uG4/AjhIvaX4v3blphbyoBKklLUPYT7Fp2/f87TOZzbqzng4Xj7X4CDhftPAK/E29vpXLC+Rec1qGeBkxVzKw8pG1BJUoq6AV0PfJew13gF+FMW7z3ejtPADcJe6E3g7sL954Hn4+1VhCjuI7zmtO0oIehlc3dV/WADKklKUTegqwnXwn0YuCfeTnkz7Y8Ih1z/mrAnu6dw/wXgua7P54BnCNFtO0SIZ9ncnVU/2IBKklLUDejfE/bwngEeirev0Nu7sdwfv679JtyngTOFOafoXHN3nM6Zvi3CXibA8TjK5o4BTE9Pn56ammoVx8ADt8zj48+ut2ZnZx0Oh8PR59FD6xZYRdi7+xXhJKLVwJ/T+0lEEzFyX4ifnyO8MTeEPdEJYJJwYhCEv49ejbevAXsJF3K4CuxfYm6plRBQ90Alqf/qBHScEMvf0HlD7VOkvYzlzwkRnY3fr31VoxbwIGHv9Aoh2PPA7nj/JJ3Xns4Qol41t5QBlSSlqBNQCHufxbNwP0r8XmsJZ812O8fC14beB6wpzJkAdpR8v7K5ixhQSVKKugGdAL4N/APwIfAXwL19WK+2J/v4vUoZUElSiroBhXCG63cJh0rvr/vNcjOgkqQUdQP6TTqHcF8l7IW+3of1ysaASpJS1A3oTeBTwok/rwJ/RYjpPfVXLQ8DKklKUSegdxBi+UNgmhDQx+Ky3+nL2mVgQCVJKerugV4nvFzkGmFP9CYLrw409AyoJClF3YA+Blyk83fQFuG1mSPDgEqSUvTjLFwIF5B/mPCylnV0Lq839AyoJClFakC/TLhE3jzhurV7CO/l2b4C0KZ+reByM6CSpBSpAb3CwqsPFceGfq3gcjOgkqQUqQGdB94lXED+PCGa+wmX4xspBlSSlCI
1oC3gpXj7ePx8JBlQSVKKOgH9G8IFE35G5+IJ7eFJRMM0DKgk9V2dgC41PIlomIYBlaS+Sw3oG8Avlxjr+rWCy82ASpJS9Ot1oCPLgEqSUhhQAypJSmBADagkKUFKQNcCrxEu3XeacD3ckWVAJUkpUgK6mnCmbftSfv8C/F1heBLRMA0DKkl9l3oI9018GcvoDAMqSX1X52+gawjXxP0G4c21u8fIMKCSpBT9OInoEeBlwruy7AXG637DnAyoJClF3YD+gMWHby/0Yb2yMaCSpBR1ArqW8N6fV4DfAXYBvyJE9Lf7snYZGFBJUoo6Ab2LEMtjXcv2x2VP1FyvbAyoJClF3UO4Nwl7oT8GXgCuxc/X11+1PAyoJClF3YA+BczS+fvnPPBc/dXKx4BKklL04yzcVcBXgH2EiyyMFAMqSUrhtXANqCQpgQE1oJKkBHUDuppwCLe4bGQYUElSitSA3kF4GcsscCjevotwJSJfBzpsw4BKUt+lBvQnLH0x+Yl+reByM6CSpBSpAf19YIrwspWL8XZ7fCtxXVYBW24xZwMwVlg2FpffztxFDKgkKUXdv4F+k/4crt0PfAq8A7wPPFC4fytwGbgEXAcOx+VHCBdvuBTv37bE3FIGVJKUom5Af48QvLnCuLPH7zMH3BtvP0i4tm63E8Ar8fZ2wmHijfHj5rj8LHCyYm7lIWUDKklKUTeg1wmB+hS42jV6+RvoRPwevyYcEv4liwN8Hng+3l4V5+8jnMTUdhR4q2LurqofbkAlSSnqBHScEKeXa67D/fH7/CTevgx8vzDnAgsvETgHPAPc6Fp2iBDPsrk7Aaanp09PTU21imPggVvm8fFn11uzs7MOh8Ph6POoE79fEfZCHyYcgm2P4mtDl7KVENBN8fM/Bt4uzDlF511fxgl7qu091/bPOh5H2dzKk4lWQkDdA5Wk/qsb0DnKX8ayaakvKlhFeFeXZ+Ptc8CfxPv2EEI5SdgzBThIOEwM4QSivfHnXSWcjFQ1t5QBlSSlqBvQY8Bflow7evw+TxP2ZG8C/wjcHZe3CCcVrSe8cfccYY9yd7x/Mn4+D8wQAlw1t5QBlSSlqBvQLYTYFUeKVSVfe45wtm3bfcCawpwJYEfJ9yubu4gBlSSlGIZDuEt5sk/fp5IBlSSlqBvQnwFn4jhHOGR6nd4P4Q6MAZUkpagb0KIjhD3QW12Sb2gYUElSiroB/SbhTNeDhIsXfEAI6M7aa5aJAZUkpViOv4F+RG+vAx0oAypJSlE3oE8DB+L4BvAIsLYP65WNAZUkpagb0I3ADwl7ndcIF3O/pw/rlY0BlSSlqBvQvyYctp0nXAShRTgLd3X9VcvDgEqSUtQJ6AZCMM/Rudbssbjsy/VXLQ8DKklKUSeg7Yu5n+5adjAu+92a65WNAZUkpah7CPcjQjAvAf8Ub3+CZ+EO1zCgktR3dQP6BeBNwt9AW4QLwT/Sh/XKxoBKklLUDeiG+HEtI3T1oW4GVJKUIjWg64D3CW8b1nYReIX0d2MZCAMqSUqRGtC/IhyyfbNr2et0rkQ0VvZFw8iASpJSpAS0ffbtpZL7zsf7Hqi5XtkYUElSipSAfo4Qyb8tue9P433/ueZ6ZWNAJUkpUg/hts+63QusIRyyfRS4gW9nNnzjx+9+6Rsvvnf0wImZl5s6/uDEhfsH/VyStLKkBvQwC9+BZb7rdtme6dBaKQE98OLM/xn4eizjePbF954d9HNJ0spS52UsTwO/IcRznnABhSOM0HVwwYA2ZRhQSbnVfR3oyDOgzRgGVFJuBtSANmIYUEm5GVAD2ohhQCXlZkANaCOGAZWUmwE1oI0YBlRSbgbUgDZiGFBJuRlQA9qIYUAl5WZADWgjhgGVlJsBNaCNGAZUUm4G1IA2YhhQSbkZUAPaiGFAJeVmQA1oI4YBlZSbATWgjRgGVFJuBtSANmIYUEm5DVtAVwPblrh/A+HNu7uNxeW3M3cRA9qMYUAl5TZsAX0NuFy
yfGtcfgm4TnhDbwjvP3otLr9MiG/V3FIGtBnDgErKbZgCOkknhEUngFfi7e1AC9gYP26Oy88CJyvmTlT9UAPajGFAJeU2LAHdRQjnPsoDeh54Pt5eRYjiPmC2a85R4K2KubuqfrABbcYwoJJyG4aArgOuAg8BT1Ae0AvAc12fzwHPADe6lh0ixLNs7k6A6enp01NTU63iGPQv/+UeH392vXXk1X8e+Hos57h45dPW7Oysw+FwZB39CmGq/YS9xE8IQWwBbxbmnAKOxdvjwDzhsGyLsJcJcDyOsrmVJxOthIC6BypJ/TcMAZ0AvhTHQcLe6L3xvj3x/kk6e6btORBOINoLbIrL9i8xt5QBbcYwoJJyG4aAdnuchYdwW8CDwHrgCuFw7DywO94/GT+fB2YIe6NVc0sZ0GYMAyopt2ELaNE5wtm2bfcBawpzJoAdJV9bNncRA9qMYUAl5TbsAX1yuX+AAW3GMKCSchv2gC47A9qMYUAl5WZADWgjhgGVlJsBNaCNGAZUUm4G1IA2YhhQSbkZUAPaiGFAJeVmQA1oI4YBlZSbATWgjRgGVFJuBtSANmIYUEm5GVAD2ohhQCXlZkANaCOGAZWUmwE1oI0YBlRSbgbUgDZiGFBJuRlQA9qIYUAl5WZADWgjhgGVlJsBNaCNGAZUUm4G1IA2YhhQSbkZUAPaiGFAJeVmQA1oI4YBlZSbATWgjRgGVFJuBtSANmIYUEm5GVAD2ohhQCXlZkANaCOGAZWUmwE1oI0YBlRSbgbUgDZiGFBJuRlQA9qIYUAl5WZADWgjhgGVlJsBNaCNGAZUUm4G1IA2YhhQSbkZUAPaiGFAJeVmQA1oI4YBlZSbATWgjRgGVFJuBtSANmIYUEm5GVAD2ohhQCXlNkwBXQ3cdYs5G4CxwrKxuPx25i5iQJsxDKik3IYloD8CbgDvAO8DDxTu3wpcBi4B14HDcfkR4FpcfhnYtsTcUga0GcOASsptGAK6FmjR2Yt8CXijMOcE8Eq8vT3O3xg/bo7LzwInK+ZOVP1wA9qMYUAl5TYMAQXYEj9OAFeB5wv3n+9atooQxX3AbNeco8BbFXN3Vf1gA9qMYUAl5TYsAQX4CuFw7LvAusJ9F4Dnuj6fA54hHPZtO0SIZ9ncnQDT09Onp6amWsUx6F/+yz0+/ux668ir/zzw9VjOcfHKp63Z2VmHw+HIOvpawURfI4TuOxX3nwKOxdvjwDxhb7VF2MsEOB5H2dzKk4lWQkDdA5Wk/huGgK4iRO7xkvv2EEI5STgxCOAg4TAvhD3WvcCmuGz/EnNLGdBmDAMqKbdhCOj9hD3J7vF2vK8FPAisB64Q9lLngd3x/sn4+TwwQ4hx1dxSBrQZw4BKym0YArqUc4SzbdvuA9YU5kwAO0q+tmzuIga0GcOASspt2AP65HL/AAPajGFAJeU27AFddga0GcOASsrNgBrQRgwDKik3A2pAGzEMqKTcDKgBbcQwoJJyM6AGtBHDgErKzYAa0EYMAyopNwNqQBsxDKik3AyoAW3EMKCScjOgBrQRw4BKys2AGtBGDAMqKTcDakAbMQyopNwMqAFtxDCgknIzoAa0EcOASsrNgBrQRoxnX3zv2QMn3vvegRMzLzd1fP3kzJLvbSspLwNqQBsxnn3xvWcPvPjepUGvx3KOr5+c+Z+D/v9FUocBNaCNGAZUUm4G1IA2YhhQSbkZUAPaiGFAJeVmQA1oI4YBlZSbATWgjRgGVFJuBtSANmIYUEm5GVAD2ohhQCXlZkANaCOGAZWUmwE1oI0YBlRSbgbUgDZiGFBJuRlQA9qIYUAl5WZADWgjhgGVlJsBNaCNGAZUUm4G1IA2YhhQSbkZUAPaiGFAJeVmQA1oI4YBlZSbATWgjRgGVFJuBtSANmIYUEm5NTmgG4CxW00yoM0YBlRSbk0M6FbgMnAJuA4cXmqyAW3GWCkBPXBi5vVBr8fyjpnXM/yOkPqiiQE9AbwSb28HWsBE1WQD2ox
hQJsyDKhGRxMDeh54Pt5eRQjorqrJBrQZw4A2Zcy8/vWTM7sPnJh5ubnjve/9wfELdx84MfNUU8fXT87s/k/HL6wf9Hos92hiQC8Az3V9PgfsBJienj49NTXV6h6vvfba/ysuczgcDofjVuP8+fP/OojILadTwLF4exyYZ4mTiaamphr3L4git7EZVsI2wsrYTrexGZq4jZOEk4gADgJXl5rcxP8ARW5jM6yEbYSVsZ1uYzM0cRvXA1cIh27ngd1LTW7if4Ait7EZVsI2wsrYTrexGZq8jfcBa241qcn/AdrcxmZYCdsIK2M73cZmWAnbuKTp6enTg16H5eY2NsNK2EZYGdvpNjbDSthGSZIkSZKGyxjh+rlNVHZd4FHa3k23uH9jybI7gLW3OXcY3Em48EcvRmkbVwN33WJOL8/T27rW9QD8VsLXVG1j1eM7aFtucX8vz8sc27iWJa5Gt4RRe+4NzBHgGuH6uZeBbYNdnSS/AX5NuJDEBcITs+q6wKOyvfcDf0JY9zK/S9iOi/HjVwm/qN8APiK8hOnvCE/2srnD4HPA04Qzxj9Xcv8+4Aadx/V5Rm8bf0TYhneA94EHCvf38jzt6VrXGf0XwjbOEP77Txbuv4PwGLcfx5/H5WXbWPX4DtqXgU8J6/8BnSu+tfXyvMy1jVPAJ8DbhOdeMYi9PC7D+twbqNWEy/1tjp+fBU4ObnWSzRKegONdy8quC7yR0dneE8C7VAf0MrA/3v52/PwJwv8wbTeAvRVzh8G3gTOEx6QsoD8A/ozwPG0bpW1cS9i29i+ulwi/OLv18jzt6VrXGV0h/EMIOv/o6fZlwi/o7sex6ndP1eM7aD8CvhtvP00ITLdenpc5tnFD4Wd8xOJ/2PTyuAzrc2+gvkiIT9tR4K0BrUuqLYQHcx64SfiFC+XXBd7HaG3vI1QH9Cbh5UoAjxFe+/t94O+75rwPHKqYO0yqAvo3hMe1Rfgf/YuM3ja2D/lNEPY2insuvTxPe7rWdUYb6fzj9QwLHx+AA3Qex0+A36f6d0/V4zssfkiI5/8qLO/leZlzG/8D8Fr8ucXDz708LsP63BuoR1j4r8VDhP9Qo+Re4BeEJ8ejhAf285RfF/gZRmt7lwroPLAj3t5F+B/1hyzcw2n/QiqbO0yqAvoC8C3C30h/Sdi2UdzGrxB+6b4LrCvc18vztPJa10Pg3wH/RNjOLxTu+xrwY8KFXg4Ttq3qd0/V4zssjhP25opHOHp5XubcxkcIsZ4nHGrv1svjMszPvYFZT/jl1T6B43gco2Q1Cy8acZHwL6Wy6wJPMFrbu1RAPyT8fQXC31UuAk/Fj20X431lc4dJVUDXd93+KuFfxk8xWtv4NcIvm+9U3N/L87Sna11n9NuEx+Y0i/+BAOFQdnsPdZywbfdTvo1PUf74DtofAffE23cR1v2ervuf4vafl1Vz++nzwB92ff5zFh8Z6OVxGdbn3sBdIxx/30Q4xLR/6elD52lCZNYAdxN+WX2e6usCj9L2FgN6D52TUF4j/G1inLAH/nPCWZDzhL3xfx9vb66YO0y6A3on8B/j7Y/ohOcnhL3QUdrGVYT1e7zkvj2EUPbyPO3pWtcZzRD+RljU3safER47CH//ax8iLNvGqsd30KYI2zFGOCw6R3iuPUj4m2Avz8sc27glruM9cZ3fBf5bvC/lcRnW597ATRIewHnC/wi9vpxg0FYR/oZwg/AEeDEur7ou8ChtbzGgxwj/I0D4H28ujhuEfzxA+B+9/XeNF24xd1i06JwN/VXC+kP4x9FNwjp/ADwUl4/KNrb/Nd893o73tQi/fHt5nvZ0reuM5li4je2/P7e3cQfheXyNcCbrgXh/1f+LZY/voD1E+AfdHOE52f6b5UXgv8bbvTwvc2zjKTrnhnxA52+gKY/LsD73hsIEneP0o+q3WHg2WVvZdYFHdXvvJxwma1tNOHxW/EfAVha/frRq7rCZYOHfpccpf6nRKG8jwDkWvja
wl+fpbV3reggUt3F7yZyqbSx7fIfBDhYeuvw+nTOQobfnZY5tXM/ik4fqPC6j8tyTFnmU4TictZy2s/i1kk305KBXIIOVsI2/x+j9LXAlPC6SJEmSJEmSJEmSJEmSJGk5PUHntYvtS6lt61r2l8DDLH4t50XCdUMlSVqRugPavszZN7qWnaIT0I+AaeDNrvtX/LtRSJJWpnZA21eJgXCptZssDuiJrq97h8XXQpUkacVoB/Qf4sethIt/twPZHdA5wl7odTpvryZJ0orUDuj/iB+/HT/+dxYH9BPCxcDfpHOd0J3Z11iSpCHQDugRwrtLXI2fP8bSh3Bfisv+OOfKSpI0LLoDOh1vXye8IXQxoJcI74v4UzrvPvJY/lWWJGnwugP6rXj7DToBrXoZy02G5222JEmSJEmSJEmSJEmSJEmSJEkj7f8DtQUzhUkXlw0AAAAASUVORK5CYII=" }, "metadata": { "jupyter-vega": "#5bcfa0e3-78bd-4117-9996-e18d50e6bb8a" }, "output_type": "display_data" } ], "source": [ "# Convert file sizes from bytes to megabytes (1 MB = 2**20 bytes here)\n", "df['mb'] = df['filesize'] / 2**10 / 2**10\n", "alt.Chart(df).mark_bar().encode(\n", " x=alt.X('mb', bin=True, title='MB'),\n", " y='count()',\n", " tooltip='count()'\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What's the biggest file available for download?" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "copyright_status No known copyright restrictions\n", "creators Geological Survey of India\n", "date 1932\n", "filesize 3623879488\n", "filesize_string 3.38GB\n", "fulltext_url http://nla.gov.au/nla.obj-591001246\n", "height 38023\n", "title Map of the City of Rangoon and suburbs 1928-29...\n", "trove_id nla.obj-591001246\n", "trove_url https://trove.nla.gov.au/work/182743876\n", "width 31769\n", "mb 3456\n", "Name: 3017, dtype: object" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# idxmax() returns an index label, so look the row up with .loc\n", "df.loc[df['filesize'].idxmax()]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All downloads greater than 3GB." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>copyright_status</th><th>creators</th><th>date</th><th>filesize</th><th>filesize_string</th><th>fulltext_url</th><th>height</th><th>title</th><th>trove_id</th><th>trove_url</th><th>width</th><th>mb</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>1218</th><td>Out of Copyright</td><td>Imray, James F. (James Frederick), 1829?-1891</td><td>1853-1863</td><td>3388748804</td><td>3.16GB</td><td>http://nla.gov.au/nla.obj-390032889</td><td>24785</td><td>Chart of the west, south and east coasts of Au...</td><td>nla.obj-390032889</td><td>https://trove.nla.gov.au/work/13684619</td><td>45575</td><td>3231.762699</td></tr>\n",
" <tr><th>3017</th><td>No known copyright restrictions</td><td>Geological Survey of India</td><td>1932</td><td>3623879488</td><td>3.38GB</td><td>http://nla.gov.au/nla.obj-591001246</td><td>38023</td><td>Map of the City of Rangoon and suburbs 1928-29...</td><td>nla.obj-591001246</td><td>https://trove.nla.gov.au/work/182743876</td><td>31769</td><td>3456.000793</td></tr>\n",
" <tr><th>4578</th><td>In Copyright</td><td>Indonesia. Direktorat Geologi</td><td>1970</td><td>3279210576</td><td>3.05GB</td><td>http://nla.gov.au/nla.obj-568387103</td><td>41429</td><td>Peta geologi teknik daerah Jakarta - Bogor : E...</td><td>nla.obj-568387103</td><td>https://trove.nla.gov.au/work/20208553</td><td>26384</td><td>3127.298904</td></tr>\n",
" <tr><th>4830</th><td>No known copyright restrictions</td><td>Taiwan</td><td>1942</td><td>3264456500</td><td>3.04GB</td><td>http://nla.gov.au/nla.obj-400826638</td><td>25508</td><td>Nyūginia-tō zenzu / Taiwan Sōtokufu Gaijibu...</td><td>nla.obj-400826638</td><td>https://trove.nla.gov.au/work/205481810</td><td>42659</td><td>3113.228321</td></tr>\n",
" <tr><th>7237</th><td>In Copyright</td><td>Indonesia. Direktorat Geologi</td><td>1963</td><td>3311801600</td><td>3.08GB</td><td>http://nla.gov.au/nla.obj-568387099</td><td>20990</td><td>Geological map of Djawa and Madura / compiled ...</td><td>nla.obj-568387099</td><td>https://trove.nla.gov.au/work/218208895</td><td>52593</td><td>3158.380127</td></tr>\n",
" <tr><th>19858</th><td>Out of Copyright</td><td>South Australia. Surveyor-General's Office</td><td>1885-1950</td><td>3308608288</td><td>3.08GB</td><td>http://nla.gov.au/nla.obj-230705067</td><td>43121</td><td>Plan shewing pastoral leases and claims in the...</td><td>nla.obj-230705067</td><td>https://trove.nla.gov.au/work/8818311</td><td>25576</td><td>3155.334747</td></tr>\n",
" </tbody>\n",
"</table>\n", "</div>
" ], "text/plain": [ " copyright_status \\\n", "1218 Out of Copyright \n", "3017 No known copyright restrictions \n", "4578 In Copyright \n", "4830 No known copyright restrictions \n", "7237 In Copyright \n", "19858 Out of Copyright \n", "\n", " creators date filesize \\\n", "1218 Imray, James F. (James Frederick), 1829?-1891 1853-1863 3388748804 \n", "3017 Geological Survey of India 1932 3623879488 \n", "4578 Indonesia. Direktorat Geologi 1970 3279210576 \n", "4830 Taiwan 1942 3264456500 \n", "7237 Indonesia. Direktorat Geologi 1963 3311801600 \n", "19858 South Australia. Surveyor-General's Office 1885-1950 3308608288 \n", "\n", " filesize_string fulltext_url height \\\n", "1218 3.16GB http://nla.gov.au/nla.obj-390032889 24785 \n", "3017 3.38GB http://nla.gov.au/nla.obj-591001246 38023 \n", "4578 3.05GB http://nla.gov.au/nla.obj-568387103 41429 \n", "4830 3.04GB http://nla.gov.au/nla.obj-400826638 25508 \n", "7237 3.08GB http://nla.gov.au/nla.obj-568387099 20990 \n", "19858 3.08GB http://nla.gov.au/nla.obj-230705067 43121 \n", "\n", " title trove_id \\\n", "1218 Chart of the west, south and east coasts of Au... nla.obj-390032889 \n", "3017 Map of the City of Rangoon and suburbs 1928-29... nla.obj-591001246 \n", "4578 Peta geologi teknik daerah Jakarta - Bogor : E... nla.obj-568387103 \n", "4830 Nyūginia-tō zenzu / Taiwan Sōtokufu Gaijibu... nla.obj-400826638 \n", "7237 Geological map of Djawa and Madura / compiled ... nla.obj-568387099 \n", "19858 Plan shewing pastoral leases and claims in the... 
nla.obj-230705067 \n", "\n", " trove_url width mb \n", "1218 https://trove.nla.gov.au/work/13684619 45575 3231.762699 \n", "3017 https://trove.nla.gov.au/work/182743876 31769 3456.000793 \n", "4578 https://trove.nla.gov.au/work/20208553 26384 3127.298904 \n", "4830 https://trove.nla.gov.au/work/205481810 42659 3113.228321 \n", "7237 https://trove.nla.gov.au/work/218208895 52593 3158.380127 \n", "19858 https://trove.nla.gov.au/work/8818311 25576 3155.334747 " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[(df['filesize'] / 2**10 / 2**10 / 2**10) > 3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The widest image?" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "copyright_status In Copyright\n", "creators Brunei Shell Petroleum Company\n", "date 1968\n", "filesize 3008938460\n", "filesize_string 2.80GB\n", "fulltext_url http://nla.gov.au/nla.obj-636346192\n", "height 14652\n", "title Land status petroleum mining agreement in resp...\n", "trove_id nla.obj-636346192\n", "trove_url https://trove.nla.gov.au/work/230363372\n", "width 68453\n", "mb 2869.55\n", "Name: 8165, dtype: object" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# idxmax() returns an index label, so look the row up with .loc\n", "df.loc[df['width'].idxmax()]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The tallest image?" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "copyright_status Out of Copyright\n", "creators South Australia. 
Surveyor-General's Office\n", "date 1885-1950\n", "filesize 3308608288\n", "filesize_string 3.08GB\n", "fulltext_url http://nla.gov.au/nla.obj-230705067\n", "height 43121\n", "title Plan shewing pastoral leases and claims in the...\n", "trove_id nla.obj-230705067\n", "trove_url https://trove.nla.gov.au/work/8818311\n", "width 25576\n", "mb 3155.33\n", "Name: 19858, dtype: object" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# idxmax() returns an index label, so look the row up with .loc\n", "df.loc[df['height'].idxmax()]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "\n", "Created by [Tim Sherratt](https://timsherratt.org/).\n", "\n", "Work on this notebook was supported by the [Humanities, Arts and Social Sciences (HASS) Data Enhanced Virtual Lab](https://tinker.edu.au/).\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }