{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Mapping Te Papa’s collections\n", "\n", "This notebook creates some simple maps using the `production.spatial` facet of the Te Papa API to identify places where collection objects were created." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

If you haven't used one of these notebooks before, they're basically web pages in which you can write, edit, and run live code. They're meant to encourage experimentation, so don't feel nervous. Just try running a few cells and see what happens!

\n", "\n", "

\n", " Some tips:\n", "

\n", "

\n", "
" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "RendererRegistry.enable('notebook')" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import requests\n", "import pandas as pd\n", "import altair as alt\n", "import re\n", "import folium\n", "from tqdm import tnrange\n", "from folium.plugins import MarkerCluster\n", "from IPython.display import display, HTML\n", "alt.renderers.enable('notebook')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get an API key\n", "[Sign up here](https://data.tepapa.govt.nz/docs/register.html) for your very own API key." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Your API key is: \n" ] } ], "source": [ "# Insert your API key between the quotes\n", "api_key = ''\n", "# If you don't have an API key yet, you can leave the above blank and we'll pick up a guest token below\n", "print('Your API key is: {}'.format(api_key))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set some parameters" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "search_endpoint = 'https://data.tepapa.govt.nz/collection/search'\n", "\n", "headers = {\n", " 'x-api-key': api_key,\n", " 'Accept': 'application/json'\n", "}\n", "\n", "if not api_key:\n", " response = requests.get('https://data.tepapa.govt.nz/collection/search')\n", " data = response.json()\n", " guest_token = data['guestToken']\n", " headers['Authorization'] = 'Bearer {}'.format(guest_token)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we set the search parameters. Currently it will return information about all objects in the collection. 
You can change the `query` value to limit the result set; try replacing the asterisk with some keywords.\n", "\n", "The `size` parameter sets the number of places to return, so in this case we're getting the 100 places that have the most objects associated with them.\n", "\n", "The `production.spatial.href` facet gives us the API URL of the place itself, so we can use it to get more information about the place." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "post_data = {\n", " 'query': '*',\n", " 'filters': [{\n", " 'field': 'type',\n", " 'keyword': 'Object'\n", " }],\n", " 'facets': [\n", " {'field': 'production.spatial.href',\n", " 'size': 100}\n", " ]\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Get some data" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# Make the API request\n", "response = requests.post(search_endpoint, json=post_data, headers=headers)\n", "data = response.json()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   place_id                                           count
0  https://data.tepapa.govt.nz/collection/place/2...  395
1  https://data.tepapa.govt.nz/collection/place/2...  8846
2  https://data.tepapa.govt.nz/collection/place/2...  430
3  https://data.tepapa.govt.nz/collection/place/2...  426
4  https://data.tepapa.govt.nz/collection/place/2...  374
\n", "
" ], "text/plain": [ " place_id count\n", "0 https://data.tepapa.govt.nz/collection/place/2... 395\n", "1 https://data.tepapa.govt.nz/collection/place/2... 8846\n", "2 https://data.tepapa.govt.nz/collection/place/2... 430\n", "3 https://data.tepapa.govt.nz/collection/place/2... 426\n", "4 https://data.tepapa.govt.nz/collection/place/2... 374" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Convert the facets data to a dataframe and do some cleaning up\n", "# We end up with two columns -- one with the place url, and the other with the number of objects associated with that place\n", "places_df = pd.DataFrame(list(data['facets']['production.spatial.href'].items()))\n", "places_df.columns = ['place_id', 'count']\n", "places_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Add more information about each place\n", "\n", "Using the place url we'll get the full record for each place. We'll then save the name of the place, its geospatial coordinates (if any), and its ISO country code (if any) to the dataframe." ] }, { "cell_type": "code", "execution_count": 126, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "de7b3dc3f4d440788c48a03b62ad8b37", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(IntProgress(value=0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   place_id                                           count  title                    lat      lon      isocode
0  https://data.tepapa.govt.nz/collection/place/2...  395    Surrey (United Kingdom)  51.200   -0.050   NaN
1  https://data.tepapa.govt.nz/collection/place/2...  8845   Auckland (New Zealand)   -36.917  174.783  NaN
2  https://data.tepapa.govt.nz/collection/place/2...  430    Kanto (Nihon)            36.250   139.500  NaN
3  https://data.tepapa.govt.nz/collection/place/2...  426    Solomon Islands          NaN      NaN      NaN
4  https://data.tepapa.govt.nz/collection/place/2...  374    Napier (New Zealand)     -39.483  176.967  NaN
\n", "
" ], "text/plain": [ " place_id count \\\n", "0 https://data.tepapa.govt.nz/collection/place/2... 395 \n", "1 https://data.tepapa.govt.nz/collection/place/2... 8845 \n", "2 https://data.tepapa.govt.nz/collection/place/2... 430 \n", "3 https://data.tepapa.govt.nz/collection/place/2... 426 \n", "4 https://data.tepapa.govt.nz/collection/place/2... 374 \n", "\n", " title lat lon isocode \n", "0 Surrey (United Kingdom) 51.200 -0.050 NaN \n", "1 Auckland (New Zealand) -36.917 174.783 NaN \n", "2 Kanto (Nihon) 36.250 139.500 NaN \n", "3 Solomon Islands NaN NaN NaN \n", "4 Napier (New Zealand) -39.483 176.967 NaN " ] }, "execution_count": 126, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def find_country_code(place):\n", " code = None\n", " if 'alternativeTerms' in place:\n", " for term in place['alternativeTerms']:\n", " try:\n", " if term[:3] == 'ISO':\n", " code = term[3:]\n", " except TypeError:\n", " pass\n", " return code \n", "\n", "for i in tnrange(len(places_df)):\n", " href = places_df.loc[i]['place_id']\n", " response = requests.get(href, headers=headers)\n", " place_data = response.json()\n", " places_df.at[i, 'title'] = place_data['title']\n", " code = find_country_code(place_data)\n", " if code:\n", " places_df.at[i, 'isocode'] = code\n", " if 'geoLocation' in place_data:\n", " places_df.at[i, 'lat'] = place_data['geoLocation']['lat']\n", " places_df.at[i, 'lon'] = place_data['geoLocation']['lon']\n", "\n", "places_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make a map" ] }, { "cell_type": "code", "execution_count": 130, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 130, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import html\n", "m = folium.Map(\n", " location=[10, 10],\n", " zoom_start=1.5\n", ")\n", "# We'll cluster the markers for better readability\n", "marker_cluster = MarkerCluster().add_to(m)\n", "\n", "for index, row in places_df.dropna(subset=['lat', 'lon']).iterrows():\n", " # We can easily change the API url to a web url and use it to link the map to the Te Papa collection web site\n", " web_url = row['place_id'].replace('/collection/', '/').replace('data', 'collections')\n", " popup = '{}
{} objects'.format(web_url, html.escape(row['title']), row['count'])\n", " folium.Marker([row['lat'], row['lon']], popup=popup).add_to(marker_cluster)\n", " \n", "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make another map\n", "\n", "Let's try and make the **number** of objects created in each place more obvious." ] }, { "cell_type": "code", "execution_count": 105, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "execution_count": 105, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import html\n", "m = folium.Map(\n", " location=[10, 10],\n", " zoom_start=1.5\n", ")\n", "\n", "for index, row in places_df.dropna(subset=['lat', 'lon']).iterrows():\n", " popup = '{}
<br>{} objects'.format(html.escape(row['title']), row['count'])\n", " folium.Circle([row['lat'], row['lon']], radius=row['count']*5, popup=popup, color='#de2d26', fill=True).add_to(m)\n", " \n", "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What's missing?\n", "\n", "Remember that we're not seeing **all** the places where objects were created. First of all, the facet `size` parameter limited our results to the top 100 places. Try changing it to see what happens.\n", "\n", "Even amongst the top 100, not every place had geospatial coordinates attached to it, so not everything is on the map. Let's create a list of places without coordinates." ] }, { "cell_type": "code", "execution_count": 132, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
    place_id                                           count  title                                 lat  lon  isocode
3   https://data.tepapa.govt.nz/collection/place/2...  426    Solomon Islands                       NaN  NaN  NaN
17  https://data.tepapa.govt.nz/collection/place/2...  592    Upolu (Samoa)                         NaN  NaN  NaN
31  https://data.tepapa.govt.nz/collection/place/314   795    Opononi                               NaN  NaN  NaN
33  https://data.tepapa.govt.nz/collection/place/2...  668    Chatham Islands (New Zealand)         NaN  NaN  NaN
46  https://data.tepapa.govt.nz/collection/place/2...  435    Jawa (Indonesia)                      NaN  NaN  NaN
47  https://data.tepapa.govt.nz/collection/place/2...  5171   North Island (New Zealand)            NaN  NaN  NaN
59  https://data.tepapa.govt.nz/collection/place/2...  613    South Island (New Zealand)            NaN  NaN  NaN
65  https://data.tepapa.govt.nz/collection/place/2...  431    Africa                                NaN  NaN  NaN
70  https://data.tepapa.govt.nz/collection/place/2...  1117   Stewart Island (New Zealand)          NaN  NaN  NaN
71  https://data.tepapa.govt.nz/collection/place/2...  287    Pacific Islands                       NaN  NaN  NaN
78  https://data.tepapa.govt.nz/collection/place/2...  475    Czechoslovakia                        NaN  NaN  NaN
94  https://data.tepapa.govt.nz/collection/place/2...  303    Admiralty Islands (Papua New Guinea)  NaN  NaN  NaN
\n", "
" ], "text/plain": [ " place_id count \\\n", "3 https://data.tepapa.govt.nz/collection/place/2... 426 \n", "17 https://data.tepapa.govt.nz/collection/place/2... 592 \n", "31 https://data.tepapa.govt.nz/collection/place/314 795 \n", "33 https://data.tepapa.govt.nz/collection/place/2... 668 \n", "46 https://data.tepapa.govt.nz/collection/place/2... 435 \n", "47 https://data.tepapa.govt.nz/collection/place/2... 5171 \n", "59 https://data.tepapa.govt.nz/collection/place/2... 613 \n", "65 https://data.tepapa.govt.nz/collection/place/2... 431 \n", "70 https://data.tepapa.govt.nz/collection/place/2... 1117 \n", "71 https://data.tepapa.govt.nz/collection/place/2... 287 \n", "78 https://data.tepapa.govt.nz/collection/place/2... 475 \n", "94 https://data.tepapa.govt.nz/collection/place/2... 303 \n", "\n", " title lat lon isocode \n", "3 Solomon Islands NaN NaN NaN \n", "17 Upolu (Samoa) NaN NaN NaN \n", "31 Opononi NaN NaN NaN \n", "33 Chatham Islands (New Zealand) NaN NaN NaN \n", "46 Jawa (Indonesia) NaN NaN NaN \n", "47 North Island (New Zealand) NaN NaN NaN \n", "59 South Island (New Zealand) NaN NaN NaN \n", "65 Africa NaN NaN NaN \n", "70 Stewart Island (New Zealand) NaN NaN NaN \n", "71 Pacific Islands NaN NaN NaN \n", "78 Czechoslovakia NaN NaN NaN \n", "94 Admiralty Islands (Papua New Guinea) NaN NaN NaN " ] }, "execution_count": 132, "metadata": {}, "output_type": "execute_result" } ], "source": [ "places_df.loc[places_df['lat'].isnull()] " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }