{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Find how many specimens of each species are in the Museums Victoria collection\n", "\n", "In [another notebook](museumvic-get-a-list-of-species.ipynb) we harvested a list of species from Museums Victoria using their collection API and saved the results as a CSV file.\n", "\n", "Here we'll search for specimens matching each of the species and save the total number of records.\n", "\n", "We'll use these search parameters:\n", "\n", "* `recordtype`, which we'll set to 'specimen'\n", "* `taxon`, which we'll set to the species' taxon name" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Import what we need" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import requests\n", "from tqdm.auto import tqdm\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "SEARCH_URL = 'https://collections.museumsvictoria.com.au/api/search'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the CSV file containing the list of species." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>id</th><th>taxon_name</th><th>common_name</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>species/8583</td><td>Melangyna viridiceps</td><td>Common Hover Fly</td></tr>\n",
"    <tr><th>1</th><td>species/8307</td><td>Tetractenos glaber</td><td>Smooth Toadfish</td></tr>\n",
"    <tr><th>2</th><td>species/8815</td><td>Salticidae</td><td>Jumping Spider</td></tr>\n",
"    <tr><th>3</th><td>species/8456</td><td>Hydromys chrysogaster</td><td>Common Water Rat</td></tr>\n",
"    <tr><th>4</th><td>species/12377</td><td>Dromaius novaehollandiae</td><td>Emu</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " id taxon_name common_name\n", "0 species/8583 Melangyna viridiceps Common Hover Fly\n", "1 species/8307 Tetractenos glaber Smooth Toadfish\n", "2 species/8815 Salticidae Jumping Spider\n", "3 species/8456 Hydromys chrysogaster Common Water Rat\n", "4 species/12377 Dromaius novaehollandiae Emu" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_species = pd.read_csv('museum-victoria-species.csv')\n", "df_species.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define some functions" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [
"def get_totals(params):\n",
"    '''\n",
"    Get the total number of results and pages returned by a search.\n",
"    '''\n",
"    response = requests.get(SEARCH_URL, params=params, headers={'User-Agent': 'Mozilla/5.0'})\n",
"    # Stop with a clear error if the request failed\n",
"    response.raise_for_status()\n",
"    # The total results and pages values are in the API response's headers!\n",
"    total_results = int(response.headers['Total-Results'])\n",
"    total_pages = int(response.headers['Total-Pages'])\n",
"    return (total_results, total_pages)\n",
"\n",
"def get_specimen_totals(species):\n",
"    '''\n",
"    Find the number of specimens matching each species.\n",
"    '''\n",
"    params = {\n",
"        'recordtype': 'specimen'\n",
"    }\n",
"    total_specimens = []\n",
"    for s in tqdm(species):\n",
"        # Reuse the same params, changing only the taxon for each search\n",
"        params['taxon'] = s['taxon_name']\n",
"        total_results, _ = get_totals(params)\n",
"        s['total_specimens'] = total_results\n",
"        total_specimens.append(s)\n",
"    return total_specimens" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Download the data!"
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "specimens = get_specimen_totals(df_species.to_dict('records'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Convert to a dataframe" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "df_specimens = pd.DataFrame(specimens)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Show the twenty species with the most specimens!" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>id</th><th>taxon_name</th><th>common_name</th><th>total_specimens</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>211</th><td>species/8463</td><td>Amphipoda</td><td>Amphipod</td><td>20655</td></tr>\n",
"    <tr><th>1184</th><td>species/8483</td><td>Leptoceridae</td><td>Caddisfly</td><td>16639</td></tr>\n",
"    <tr><th>1072</th><td>species/8494</td><td>Leptoceridae</td><td>Caddisfly larva</td><td>16639</td></tr>\n",
"    <tr><th>1103</th><td>species/15127</td><td>Chrysomelidae</td><td>Eucalyptus Leaf Beetle</td><td>11534</td></tr>\n",
"    <tr><th>204</th><td>species/8532</td><td>Castiarina</td><td>Jewel Beetle</td><td>9626</td></tr>\n",
"    <tr><th>208</th><td>species/8480</td><td>Hydropsychidae</td><td>Caddisfly</td><td>8340</td></tr>\n",
"    <tr><th>1079</th><td>species/8492</td><td>Hydropsychidae</td><td>Caddisfly larva</td><td>8340</td></tr>\n",
"    <tr><th>459</th><td>species/15892</td><td>Ophiurida</td><td>Brittle Star</td><td>8318</td></tr>\n",
"    <tr><th>226</th><td>species/8360</td><td>Litoria ewingii</td><td>Brown Tree Frog</td><td>6040</td></tr>\n",
"    <tr><th>1196</th><td>species/8468</td><td>Ostracoda</td><td>Seed Shrimp</td><td>5925</td></tr>\n",
"    <tr><th>92</th><td>species/8341</td><td>Crinia signifera</td><td>Common Eastern Froglet</td><td>5666</td></tr>\n",
"    <tr><th>1398</th><td>species/15125</td><td>Ichneumonidae</td><td>NaN</td><td>5404</td></tr>\n",
"    <tr><th>243</th><td>species/8395</td><td>Eulamprus</td><td>Water Skink</td><td>5081</td></tr>\n",
"    <tr><th>213</th><td>species/15891</td><td>Holothuroidea</td><td>NaN</td><td>4858</td></tr>\n",
"    <tr><th>101</th><td>species/15886</td><td>Anomura</td><td>NaN</td><td>3427</td></tr>\n",
"    <tr><th>28</th><td>species/8365</td><td>Litoria raniformis</td><td>Southern Bell Frog</td><td>3029</td></tr>\n",
"    <tr><th>1186</th><td>species/8509</td><td>Planorbidae</td><td>Freshwater Snail</td><td>3000</td></tr>\n",
"    <tr><th>1221</th><td>species/8425</td><td>Antechinus agilis</td><td>Agile Antechinus</td><td>2966</td></tr>\n",
"    <tr><th>255</th><td>species/8396</td><td>Lampropholis</td><td>Garden Skink</td><td>2962</td></tr>\n",
"    <tr><th>615</th><td>species/8619</td><td>Zoantharia</td><td>Zoanthid</td><td>2786</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " id taxon_name common_name \\\n", "211 species/8463 Amphipoda Amphipod \n", "1184 species/8483 Leptoceridae Caddisfly \n", "1072 species/8494 Leptoceridae Caddisfly larva \n", "1103 species/15127 Chrysomelidae Eucalyptus Leaf Beetle \n", "204 species/8532 Castiarina Jewel Beetle \n", "208 species/8480 Hydropsychidae Caddisfly \n", "1079 species/8492 Hydropsychidae Caddisfly larva \n", "459 species/15892 Ophiurida Brittle Star \n", "226 species/8360 Litoria ewingii Brown Tree Frog \n", "1196 species/8468 Ostracoda Seed Shrimp \n", "92 species/8341 Crinia signifera Common Eastern Froglet \n", "1398 species/15125 Ichneumonidae NaN \n", "243 species/8395 Eulamprus Water Skink \n", "213 species/15891 Holothuroidea NaN \n", "101 species/15886 Anomura NaN \n", "28 species/8365 Litoria raniformis Southern Bell Frog \n", "1186 species/8509 Planorbidae Freshwater Snail \n", "1221 species/8425 Antechinus agilis Agile Antechinus \n", "255 species/8396 Lampropholis Garden Skink \n", "615 species/8619 Zoantharia Zoanthid \n", "\n", " total_specimens \n", "211 20655 \n", "1184 16639 \n", "1072 16639 \n", "1103 11534 \n", "204 9626 \n", "208 8340 \n", "1079 8340 \n", "459 8318 \n", "226 6040 \n", "1196 5925 \n", "92 5666 \n", "1398 5404 \n", "243 5081 \n", "213 4858 \n", "101 3427 \n", "28 3029 \n", "1186 3000 \n", "1221 2966 \n", "255 2962 \n", "615 2786 " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Sort the dataframe by total_specimens, then show the first 20 records\n", "df_specimens.sort_values(by='total_specimens', ascending=False)[:20]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What next?\n", "\n", "* How might you visualise these results?\n", "* Could we include other taxonomic data to group the species?\n", "* How could we get an image of each species (selected at random from matching specimens)?
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "\n", "Created by [Tim Sherratt](https://timsherratt.org/) for the [GLAM Workbench](https://glam-workbench.github.io/). Support me by becoming a [GitHub sponsor](https://github.com/sponsors/wragge)!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }