{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# DOWNLOAD ALL THE ANTECHINUSES!\n",
    "\n",
    "In [another notebook](museumvic-find-specimens-of-each-species.ipynb) we discovered there are 2,883 Antechinus specimens in the Museum of Victoria. Let's see how many pictures we can find of them.\n",
    "\n",
    "![Antechinus total results](images/antechinus-totals.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import what we need"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "from slugify import slugify\n",
    "from pathlib import Path\n",
    "import os\n",
    "from tqdm.auto import tqdm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "SEARCH_URL = 'https://collections.museumsvictoria.com.au/api/search'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Define some functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_totals(params):\n",
    "    '''\n",
    "    Get the total number of results and pages returned by a search.\n",
    "    '''\n",
    "    response = requests.get(SEARCH_URL, params=params, headers={'User-Agent': 'Mozilla/5.0'})\n",
    "    # The total results and pages values are in the API response's headers!\n",
    "    total_results = int(response.headers['Total-Results'])\n",
    "    total_pages = int(response.headers['Total-Pages'])\n",
    "    return (total_results, total_pages)\n",
    "\n",
    "def download_images(species, size='small'):\n",
    "    '''\n",
    "    Download all the available images for the given specimen.\n",
    "    '''\n",
    "    # Create a directory to save the images in based on the species name\n",
    "    image_path = Path('images', slugify(species))\n",
    "    image_path.mkdir(exist_ok=True, parents=True)\n",
    "    # Note the hasimages parameter to only get records with images\n",
    "    params = {\n",
    "        'recordtype': 'specimen',\n",
    "        'taxon': species,\n",
    "        'perpage': 100,\n",
    "        'hasimages': 'yes',\n",
    "        'sort': 'date'\n",
    "    }\n",
    "    _, total_pages = get_totals(params)\n",
    "    # Loop through the pages\n",
    "    for page in tqdm(range(1, total_pages + 1)):\n",
    "        response = requests.get(SEARCH_URL, params=params, headers={'User-Agent': 'Mozilla/5.0'})\n",
    "        data = response.json()\n",
    "        # Loop through the records\n",
    "        for record in data:\n",
    "            # Loop through the attached media (may be more than one image in a record)\n",
    "            for media in record['media']:\n",
    "                # Media items can be videos too\n",
    "                if media['type'] == 'image':\n",
    "                    url = media[size]['uri']\n",
    "                    # Get the current image name\n",
    "                    image_name = os.path.basename(url)\n",
    "                    # Add the specimen id to the image name\n",
    "                    image_file = Path(image_path, f'{slugify(record[\"id\"])}-{image_name}')\n",
    "                    response = requests.get(url)\n",
    "                    # Save the image\n",
    "                    image_file.write_bytes(response.content)               "
   ]
  },
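  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the record structure a bit clearer, here is a minimal sketch of the media-filtering step on its own. The `image_urls` helper and the sample record are made up for illustration. The sample only mimics the fields that `download_images` reads (`id`, `media`, `type`, and a size key holding a `uri`), and real API records contain much more."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def image_urls(record, size='small'):\n",
    "    '''\n",
    "    Return the urls of all the images attached to a record at the given size.\n",
    "    (Hypothetical helper for illustration, not part of the API.)\n",
    "    '''\n",
    "    urls = []\n",
    "    for media in record.get('media', []):\n",
    "        # Skip videos and other non-image media, and images without the requested size\n",
    "        if media.get('type') == 'image' and size in media:\n",
    "            urls.append(media[size]['uri'])\n",
    "    return urls\n",
    "\n",
    "# A made-up record containing only the fields used above\n",
    "sample_record = {\n",
    "    'id': 'specimens/000000',\n",
    "    'media': [\n",
    "        {'type': 'image', 'small': {'uri': 'https://example.org/media/1-small.jpg'}},\n",
    "        {'type': 'video'},\n",
    "        {'type': 'image', 'small': {'uri': 'https://example.org/media/2-small.jpg'}}\n",
    "    ]\n",
    "}\n",
    "image_urls(sample_record)"
   ]
  },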
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## DOWNLOAD THE ANTECHINUSES!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "download_images('Antechinus agilis')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Results\n",
    "\n",
    "The code above creates a directory under `images` for each downloaded species."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "While only one of the 2,883 antechinus records had images attached, there were multiple images in that record.\n",
    "\n",
    "![Antechinus thumbnails](images/antechinus-images.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What next?\n",
    "\n",
    "Obviously this little image download example could be modified to download things other than Antechinuses!"
   ]
  },
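  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One possible way to adapt the example is to pull the search parameters out into a small function, so the taxon and record type can be changed in one place. This is only a sketch: `build_params` is a hypothetical helper, the species name below is just an example value, and any `recordtype` other than `specimen` is an assumption that should be checked against the API documentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def build_params(taxon, recordtype='specimen', perpage=100):\n",
    "    '''\n",
    "    Build search parameters like the ones used in download_images.\n",
    "    (Hypothetical helper for illustration.)\n",
    "    '''\n",
    "    return {\n",
    "        'recordtype': recordtype,\n",
    "        'taxon': taxon,\n",
    "        'perpage': perpage,\n",
    "        # Only get records that have images attached\n",
    "        'hasimages': 'yes',\n",
    "        'sort': 'date'\n",
    "    }\n",
    "\n",
    "# Example value only, any other taxon name could be used here\n",
    "build_params('Antechinus flavipes')"
   ]
  },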
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----\n",
    "\n",
    "Created by [Tim Sherratt](https://timsherratt.org/) for the [GLAM Workbench](https://glam-workbench.github.io/).  Support me by becoming a [GitHub sponsor](https://github.com/sponsors/wragge)!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}