{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Mapping Te Papa’s collections\n", "\n", "This notebook creates some simple maps using the `production.spatial` facet of the Te Papa API to identify places where collection objects were created." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

If you haven't used one of these notebooks before, they're basically web pages in which you can write, edit, and run live code. They're meant to encourage experimentation, so don't feel nervous. Just try running a few cells and see what happens!

\n", "\n", "

\n", " Some tips:\n", "

\n", "

\n", "
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import requests\n", "import pandas as pd\n", "import altair as alt\n", "import re\n", "import folium\n", "from tqdm.auto import tqdm\n", "from folium.plugins import MarkerCluster\n", "from IPython.display import display, HTML" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get an API key\n", "[Sign up here](https://data.tepapa.govt.nz/docs/register.html) for your very own API key." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Insert your API key between the quotes\n", "api_key = 'YOUR API KEY'\n", "# If you don't have an API key yet, set api_key to an empty string ('') and we'll pick up a guest token below\n", "print('Your API key is: {}'.format(api_key))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set some parameters" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "search_endpoint = 'https://data.tepapa.govt.nz/collection/search'\n", "\n", "headers = {\n", "    'x-api-key': api_key,\n", "    'Accept': 'application/json'\n", "}\n", "\n", "if not api_key:\n", "    response = requests.get('https://data.tepapa.govt.nz/collection/search')\n", "    data = response.json()\n", "    guest_token = data['guestToken']\n", "    headers['Authorization'] = 'Bearer {}'.format(guest_token)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we set the search parameters. Currently it will return information about all objects in the collection. You can change the `query` value to limit the result set: try replacing the asterisk with some keywords.\n", "\n", "The `size` parameter sets the number of places to return, so in this case we're getting the 100 places that have the most objects associated with them.\n", "\n", "The `production.spatial.href` facet gives us the API url of the place itself, so we can use it to get more information about the place." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "post_data = {\n", " 'query': '*',\n", " 'filters': [{\n", " 'field': 'type',\n", " 'keyword': 'Object'\n", " }],\n", " 'facets': [\n", " {'field': 'production.spatial.href',\n", " 'size': 100}\n", " ]\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get some data" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Make the API request\n", "response = requests.post(search_endpoint, json=post_data, headers=headers)\n", "data = response.json()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   place_id                                           count
0  https://data.tepapa.govt.nz/collection/place/2...  536
1  https://data.tepapa.govt.nz/collection/place/2...  9698
2  https://data.tepapa.govt.nz/collection/place/2...  430
3  https://data.tepapa.govt.nz/collection/place/2...  554
4  https://data.tepapa.govt.nz/collection/place/2...  466
\n", "
" ], "text/plain": [ " place_id count\n", "0 https://data.tepapa.govt.nz/collection/place/2... 536\n", "1 https://data.tepapa.govt.nz/collection/place/2... 9698\n", "2 https://data.tepapa.govt.nz/collection/place/2... 430\n", "3 https://data.tepapa.govt.nz/collection/place/2... 554\n", "4 https://data.tepapa.govt.nz/collection/place/2... 466" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Convert the facets data to a dataframe and do some cleaning up\n", "# We end up with two columns -- one with the place url, and the other with the number of objects associated with that place\n", "places_df = pd.DataFrame(list(data['facets']['production.spatial.href'].items()))\n", "places_df.columns = ['place_id', 'count']\n", "places_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Add more information about each place\n", "\n", "Using the place url we'll get the full record for each place. We'll then save the name of the place, its geospatial coordinates (if any), and its ISO country code (if any) to the dataframe." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f769cf31b9444dbbb24fd3b3df7cba38", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/100 [00:00\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   place_id                                           count  title                           lat      lon      isocode
0  https://data.tepapa.govt.nz/collection/place/2...  536    Surrey (United Kingdom)         51.200   -0.050   None
1  https://data.tepapa.govt.nz/collection/place/2...  9698   Auckland (New Zealand)          -36.917  174.783  None
2  https://data.tepapa.govt.nz/collection/place/2...  430    Kanto (Nihon)                   36.250   139.500  None
3  https://data.tepapa.govt.nz/collection/place/2...  554    D'Urville Island (New Zealand)  NaN      NaN      None
4  https://data.tepapa.govt.nz/collection/place/2...  466    Solomon Islands                 NaN      NaN      None
\n", "" ], "text/plain": [ " place_id count \\\n", "0 https://data.tepapa.govt.nz/collection/place/2... 536 \n", "1 https://data.tepapa.govt.nz/collection/place/2... 9698 \n", "2 https://data.tepapa.govt.nz/collection/place/2... 430 \n", "3 https://data.tepapa.govt.nz/collection/place/2... 554 \n", "4 https://data.tepapa.govt.nz/collection/place/2... 466 \n", "\n", " title lat lon isocode \n", "0 Surrey (United Kingdom) 51.200 -0.050 None \n", "1 Auckland (New Zealand) -36.917 174.783 None \n", "2 Kanto (Nihon) 36.250 139.500 None \n", "3 D'Urville Island (New Zealand) NaN NaN None \n", "4 Solomon Islands NaN NaN None " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tqdm.pandas()\n", "\n", "def find_country_code(place):\n", "    code = None\n", "    if 'alternativeTerms' in place:\n", "        for term in place['alternativeTerms']:\n", "            try:\n", "                if term[:3] == 'ISO':\n", "                    code = term[3:]\n", "            except TypeError:\n", "                pass\n", "    return code\n", "\n", "def add_place_info(place_id):\n", "    response = requests.get(place_id, headers=headers)\n", "    place_data = response.json()\n", "    code = find_country_code(place_data)\n", "    if 'geoLocation' in place_data:\n", "        lat = place_data['geoLocation']['lat']\n", "        lon = place_data['geoLocation']['lon']\n", "    else:\n", "        lat = None\n", "        lon = None\n", "    return pd.Series([place_data['title'], lat, lon, code])\n", "\n", "places_df[['title', 'lat', 'lon', 'isocode']] = places_df['place_id'].progress_apply(add_place_info)\n", "places_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make a map" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import html\n", "m = folium.Map(\n", "    location=[10, 10],\n", "    zoom_start=1.5\n", ")\n", "# We'll cluster the markers for better readability\n", "marker_cluster = MarkerCluster().add_to(m)\n", "\n", "for index, row in places_df.dropna(subset=['lat', 'lon']).iterrows():\n", "    # We can easily change the API url to a web url and use it to link the map to the Te Papa collection web site\n", "    web_url = row['place_id'].replace('/collection/', '/').replace('data', 'collections')\n", "    popup = '<a href=\"{}\">{}</a><br>{} objects'.format(web_url, html.escape(row['title']), row['count'])\n", "    folium.Marker([row['lat'], row['lon']], popup=popup).add_to(marker_cluster)\n", "\n", "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make another map\n", "\n", "Let's try to make the **number** of objects created in each place more obvious." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import html\n", "m = folium.Map(\n", "    location=[10, 10],\n", "    zoom_start=1.5\n", ")\n", "\n", "for index, row in places_df.dropna(subset=['lat', 'lon']).iterrows():\n", "    popup = '{}<br>{} objects'.format(html.escape(row['title']), row['count'])\n", "    folium.Circle([row['lat'], row['lon']], radius=row['count']*5, popup=popup, color='#de2d26', fill=True).add_to(m)\n", "\n", "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What's missing?\n", "\n", "Remember that we're not seeing **all** the places where objects were created. First of all, the facet `size` parameter limited our results to the top 100 places. Try changing it to see what happens.\n", "\n", "Even amongst the top 100, not every place had geospatial coordinates attached to it. So not everything is on the map. Let's create a list of places without coordinates." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
    place_id                                           count  title                                 lat  lon  isocode
3   https://data.tepapa.govt.nz/collection/place/2...  554    D'Urville Island (New Zealand)        NaN  NaN  None
4   https://data.tepapa.govt.nz/collection/place/2...  466    Solomon Islands                       NaN  NaN  None
34  https://data.tepapa.govt.nz/collection/place/2...  682    Chatham Islands (New Zealand)         NaN  NaN  None
45  https://data.tepapa.govt.nz/collection/place/2...  435    Jawa (Indonesia)                      NaN  NaN  None
46  https://data.tepapa.govt.nz/collection/place/2...  5228   North Island (New Zealand)            NaN  NaN  None
58  https://data.tepapa.govt.nz/collection/place/2...  769    South Island (New Zealand)            NaN  NaN  None
64  https://data.tepapa.govt.nz/collection/place/2...  409    Africa                                NaN  NaN  None
70  https://data.tepapa.govt.nz/collection/place/2...  1147   Stewart Island (New Zealand)          NaN  NaN  None
71  https://data.tepapa.govt.nz/collection/place/2...  285    Pacific Islands                       NaN  NaN  None
77  https://data.tepapa.govt.nz/collection/place/2...  433    Czechoslovakia                        NaN  NaN  None
86  https://data.tepapa.govt.nz/collection/place/2...  431    Tararua Range (New Zealand)           NaN  NaN  None
94  https://data.tepapa.govt.nz/collection/place/2...  303    Admiralty Islands (Papua New Guinea)  NaN  NaN  None
\n", "
" ], "text/plain": [ " place_id count \\\n", "3 https://data.tepapa.govt.nz/collection/place/2... 554 \n", "4 https://data.tepapa.govt.nz/collection/place/2... 466 \n", "34 https://data.tepapa.govt.nz/collection/place/2... 682 \n", "45 https://data.tepapa.govt.nz/collection/place/2... 435 \n", "46 https://data.tepapa.govt.nz/collection/place/2... 5228 \n", "58 https://data.tepapa.govt.nz/collection/place/2... 769 \n", "64 https://data.tepapa.govt.nz/collection/place/2... 409 \n", "70 https://data.tepapa.govt.nz/collection/place/2... 1147 \n", "71 https://data.tepapa.govt.nz/collection/place/2... 285 \n", "77 https://data.tepapa.govt.nz/collection/place/2... 433 \n", "86 https://data.tepapa.govt.nz/collection/place/2... 431 \n", "94 https://data.tepapa.govt.nz/collection/place/2... 303 \n", "\n", " title lat lon isocode \n", "3 D'Urville Island (New Zealand) NaN NaN None \n", "4 Solomon Islands NaN NaN None \n", "34 Chatham Islands (New Zealand) NaN NaN None \n", "45 Jawa (Indonesia) NaN NaN None \n", "46 North Island (New Zealand) NaN NaN None \n", "58 South Island (New Zealand) NaN NaN None \n", "64 Africa NaN NaN None \n", "70 Stewart Island (New Zealand) NaN NaN None \n", "71 Pacific Islands NaN NaN None \n", "77 Czechoslovakia NaN NaN None \n", "86 Tararua Range (New Zealand) NaN NaN None \n", "94 Admiralty Islands (Papua New Guinea) NaN NaN None " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "places_df.loc[places_df['lat'].isnull()] " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "\n", "Created by [Tim Sherratt](https://timsherratt.org/) for the [GLAM Workbench](https://glam-workbench.net/). Support this project by becoming a [GitHub sponsor](https://github.com/sponsors/wragge?o=esb)." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }