{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Find how many specimens of each species are in the Museums Victoria collection\n",
"\n",
"In [another notebook](museumvic-get-a-list-of-species.ipynb) we harvested a list of species from the Museum of Victoria using their collection API and saved the results as a CSV file.\n",
"\n",
"Here we'll search for specimens matching each of the species and save the total number of records.\n",
"\n",
"We'll use these search parameters:\n",
"\n",
"* `recordtype` which we'll set to 'specimen'\n",
"* `taxon` which we'll set the the species' taxon name"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import what we need"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"from tqdm.auto import tqdm\n",
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"SEARCH_URL = 'https://collections.museumsvictoria.com.au/api/search'"
]
},
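{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see how the `recordtype` and `taxon` parameters end up in a request, here's a minimal sketch that builds the search URL without sending anything to the API. The taxon name is just an example, and `example_params` and `example_request` are names made up for this illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"\n",
"# Example values only: any taxon name from the species list would do\n",
"example_params = {'recordtype': 'specimen', 'taxon': 'Litoria ewingii'}\n",
"\n",
"# Prepare (but don't send) a GET request to see the URL that requests would build\n",
"example_request = requests.Request('GET', SEARCH_URL, params=example_params).prepare()\n",
"print(example_request.url)"
]
},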
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Load the CSV file containing the list of species."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" id | \n",
" taxon_name | \n",
" common_name | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" species/8583 | \n",
" Melangyna viridiceps | \n",
" Common Hover Fly | \n",
"
\n",
" \n",
" 1 | \n",
" species/8307 | \n",
" Tetractenos glaber | \n",
" Smooth Toadfish | \n",
"
\n",
" \n",
" 2 | \n",
" species/8815 | \n",
" Salticidae | \n",
" Jumping Spider | \n",
"
\n",
" \n",
" 3 | \n",
" species/8456 | \n",
" Hydromys chrysogaster | \n",
" Common Water Rat | \n",
"
\n",
" \n",
" 4 | \n",
" species/12377 | \n",
" Dromaius novaehollandiae | \n",
" Emu | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" id taxon_name common_name\n",
"0 species/8583 Melangyna viridiceps Common Hover Fly\n",
"1 species/8307 Tetractenos glaber Smooth Toadfish\n",
"2 species/8815 Salticidae Jumping Spider\n",
"3 species/8456 Hydromys chrysogaster Common Water Rat\n",
"4 species/12377 Dromaius novaehollandiae Emu"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_species = pd.read_csv('museum-victoria-species.csv')\n",
"df_species.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define some functions"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"def get_totals(params):\n",
" '''\n",
" Get the total number of results and pages returned by a search.\n",
" '''\n",
" response = requests.get(SEARCH_URL, params=params, headers={'User-Agent': 'Mozilla/5.0'})\n",
" # The total results and pages values are in the API response's headers!\n",
" total_results = int(response.headers['Total-Results'])\n",
" total_pages = int(response.headers['Total-Pages'])\n",
" return (total_results, total_pages)\n",
"\n",
"def get_specimen_totals(species):\n",
" '''\n",
" Find the number of specimens matching each species.\n",
" '''\n",
" params = {\n",
" 'recordtype': 'specimen'\n",
" }\n",
" total_specimens = []\n",
" for s in tqdm(species):\n",
" params['taxon'] = s['taxon_name']\n",
" total_results, _ = get_totals(params)\n",
" s['total_specimens'] = total_results\n",
" total_specimens.append(s)\n",
" return total_specimens"
]
},
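{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before looping over the whole list, it can help to try `get_totals()` on a single taxon. This is only a sketch: the taxon name is an example, `test_params` is a name made up for illustration, and the numbers returned will depend on the live API."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Example taxon chosen for illustration; swap in any value from df_species['taxon_name']\n",
"test_params = {'recordtype': 'specimen', 'taxon': 'Litoria ewingii'}\n",
"\n",
"# get_totals() returns a (total_results, total_pages) tuple\n",
"total_results, total_pages = get_totals(test_params)\n",
"print(f'{total_results} specimens across {total_pages} pages of results')"
]
},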
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Download the data!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"specimens = get_specimen_totals(df_species.to_dict('records'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Convert to a dataframe"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"df_specimens = pd.DataFrame(specimens)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Show the top twenty specimens by species!"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" id | \n",
" taxon_name | \n",
" common_name | \n",
" total_specimens | \n",
"
\n",
" \n",
" \n",
" \n",
" 211 | \n",
" species/8463 | \n",
" Amphipoda | \n",
" Amphipod | \n",
" 20655 | \n",
"
\n",
" \n",
" 1184 | \n",
" species/8483 | \n",
" Leptoceridae | \n",
" Caddisfly | \n",
" 16639 | \n",
"
\n",
" \n",
" 1072 | \n",
" species/8494 | \n",
" Leptoceridae | \n",
" Caddisfly larva | \n",
" 16639 | \n",
"
\n",
" \n",
" 1103 | \n",
" species/15127 | \n",
" Chrysomelidae | \n",
" Eucalyptus Leaf Beetle | \n",
" 11534 | \n",
"
\n",
" \n",
" 204 | \n",
" species/8532 | \n",
" Castiarina | \n",
" Jewel Beetle | \n",
" 9626 | \n",
"
\n",
" \n",
" 208 | \n",
" species/8480 | \n",
" Hydropsychidae | \n",
" Caddisfly | \n",
" 8340 | \n",
"
\n",
" \n",
" 1079 | \n",
" species/8492 | \n",
" Hydropsychidae | \n",
" Caddisfly larva | \n",
" 8340 | \n",
"
\n",
" \n",
" 459 | \n",
" species/15892 | \n",
" Ophiurida | \n",
" Brittle Star | \n",
" 8318 | \n",
"
\n",
" \n",
" 226 | \n",
" species/8360 | \n",
" Litoria ewingii | \n",
" Brown Tree Frog | \n",
" 6040 | \n",
"
\n",
" \n",
" 1196 | \n",
" species/8468 | \n",
" Ostracoda | \n",
" Seed Shrimp | \n",
" 5925 | \n",
"
\n",
" \n",
" 92 | \n",
" species/8341 | \n",
" Crinia signifera | \n",
" Common Eastern Froglet | \n",
" 5666 | \n",
"
\n",
" \n",
" 1398 | \n",
" species/15125 | \n",
" Ichneumonidae | \n",
" NaN | \n",
" 5404 | \n",
"
\n",
" \n",
" 243 | \n",
" species/8395 | \n",
" Eulamprus | \n",
" Water Skink | \n",
" 5081 | \n",
"
\n",
" \n",
" 213 | \n",
" species/15891 | \n",
" Holothuroidea | \n",
" NaN | \n",
" 4858 | \n",
"
\n",
" \n",
" 101 | \n",
" species/15886 | \n",
" Anomura | \n",
" NaN | \n",
" 3427 | \n",
"
\n",
" \n",
" 28 | \n",
" species/8365 | \n",
" Litoria raniformis | \n",
" Southern Bell Frog | \n",
" 3029 | \n",
"
\n",
" \n",
" 1186 | \n",
" species/8509 | \n",
" Planorbidae | \n",
" Freshwater Snail | \n",
" 3000 | \n",
"
\n",
" \n",
" 1221 | \n",
" species/8425 | \n",
" Antechinus agilis | \n",
" Agile Antechinus | \n",
" 2966 | \n",
"
\n",
" \n",
" 255 | \n",
" species/8396 | \n",
" Lampropholis | \n",
" Garden Skink | \n",
" 2962 | \n",
"
\n",
" \n",
" 615 | \n",
" species/8619 | \n",
" Zoantharia | \n",
" Zoanthid | \n",
" 2786 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" id taxon_name common_name \\\n",
"211 species/8463 Amphipoda Amphipod \n",
"1184 species/8483 Leptoceridae Caddisfly \n",
"1072 species/8494 Leptoceridae Caddisfly larva \n",
"1103 species/15127 Chrysomelidae Eucalyptus Leaf Beetle \n",
"204 species/8532 Castiarina Jewel Beetle \n",
"208 species/8480 Hydropsychidae Caddisfly \n",
"1079 species/8492 Hydropsychidae Caddisfly larva \n",
"459 species/15892 Ophiurida Brittle Star \n",
"226 species/8360 Litoria ewingii Brown Tree Frog \n",
"1196 species/8468 Ostracoda Seed Shrimp \n",
"92 species/8341 Crinia signifera Common Eastern Froglet \n",
"1398 species/15125 Ichneumonidae NaN \n",
"243 species/8395 Eulamprus Water Skink \n",
"213 species/15891 Holothuroidea NaN \n",
"101 species/15886 Anomura NaN \n",
"28 species/8365 Litoria raniformis Southern Bell Frog \n",
"1186 species/8509 Planorbidae Freshwater Snail \n",
"1221 species/8425 Antechinus agilis Agile Antechinus \n",
"255 species/8396 Lampropholis Garden Skink \n",
"615 species/8619 Zoantharia Zoanthid \n",
"\n",
" total_specimens \n",
"211 20655 \n",
"1184 16639 \n",
"1072 16639 \n",
"1103 11534 \n",
"204 9626 \n",
"208 8340 \n",
"1079 8340 \n",
"459 8318 \n",
"226 6040 \n",
"1196 5925 \n",
"92 5666 \n",
"1398 5404 \n",
"243 5081 \n",
"213 4858 \n",
"101 3427 \n",
"28 3029 \n",
"1186 3000 \n",
"1221 2966 \n",
"255 2962 \n",
"615 2786 "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Sort the dataframe by total_results then show a slice of the first 20 records\n",
"df_specimens.sort_values(by='total_specimens', ascending=False)[:20]"
]
},
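{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some taxon names appear more than once in the table above. For example, *Leptoceridae* is listed as both 'Caddisfly' and 'Caddisfly larva', and because we search by taxon name, both rows get the same total. If you'd rather count each taxon once, one possible approach (sketched here, not the only option) is to drop the repeated taxon names before sorting. The name `df_unique_taxa` is made up for this illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Keep only the first row for each taxon_name (this discards the other common names)\n",
"df_unique_taxa = df_specimens.drop_duplicates(subset='taxon_name', keep='first')\n",
"\n",
"# Then show the top 20 taxa by number of specimens\n",
"df_unique_taxa.sort_values(by='total_specimens', ascending=False)[:20]"
]
},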
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What next?\n",
"\n",
"* How might you visualise these results?\n",
"* Could we include other taxonomic data to group the species?\n",
"* How could we get an image of each species (selected at random from matching specimens)? "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"----\n",
"\n",
"Created by [Tim Sherratt](https://timsherratt.org/) for the [GLAM Workbench](https://glam-workbench.github.io/). Support me by becoming a [GitHub sponsor](https://github.com/sponsors/wragge)!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}