{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Getting higher resolution versions of photos from the State Library of South Australia online collection interface\n", "\n", "The State Library of South Australia makes a fabulous [collection of out of copyright photographs](https://www.slsa.sa.gov.au/photographs) available online. However, while you can zoom in using their collection interface to examine the details of many of these images, the download option seems to provide copies at a much lower resolution. This limits their usefulness for many types of research.\n", "\n", "This notebook simply takes the tiled versions of the images which are displayed in the collection interface and stitches them together to create higher resolution versions.\n", "\n", "For example, the version of [this photograph of Clement Wragge](https://collections.slsa.sa.gov.au/resource/B+43122) provided by the 'Download' button is 1024 x 787 pixels. The version created by this notebook is 5785 × 4337 pixels. \n", "\n", "Note that images available for download from the SLSA's [digital collections](https://digital.collections.slsa.sa.gov.au/) seem to be at a much higher resolution so don't need any special tricks to use." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setting things up\n", "\n", "Run these cells using **Shift+Enter** to get the code ready to use." ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [], "source": [ "import requests\n", "from PIL import Image\n", "from io import BytesIO\n", "from slugify import slugify\n", "import re\n", "from IPython.display import display, HTML, FileLink" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [], "source": [ "def get_json(url):\n", " '''\n", " Get the JSON file that includes information about the zoom levels and tiles.\n", " '''\n", " json_url = '{}/{}'.format(url.rstrip('/'), 'tiles.json')\n", " response = requests.get(json_url)\n", " data = response.json()\n", " return data\n", "\n", "def get_highest_level(data):\n", " '''\n", " Find the highest level of zoom -- ie the biggest version of the image -- in the JSON data.\n", " '''\n", " for level in data['levels']:\n", " if level['name'] == 'z0':\n", " highest_zoom = level\n", " break\n", " return highest_zoom\n", "\n", "def download_image(url):\n", " '''\n", " Provide a url of a digitised photos, and get back the largest possible version for download.\n", " Gets information about available zoom levels and tiles, then stitches the tiles together.\n", " '''\n", " # Get data about levels\n", " data = get_json(url)\n", " # Get the highest zoom level\n", " level = get_highest_level(data)\n", " # Dimensions of the biggest image\n", " w = level['width']\n", " h = level['height']\n", " # Create an empty image to paste the tiles into\n", " img = Image.new('RGB', (w, h))\n", " # Loop through all the tiles\n", " for index, tile in enumerate(level['tiles']):\n", " # Get a tile and open as an image\n", " response = requests.get(tile['url'])\n", " tile_img = Image.open(BytesIO(response.content))\n", " # When we've got the first tile, grab the height and width\n", " if index == 0:\n", " tile_w, tile_h = tile_img.size\n", " # The tile data includes an x and y index value indicating the position of the tile\n", " # To calculate it's coordinates, just multiply the index by the width/height\n", " x = tile['x'] * tile_w\n", " y = tile['y'] * tile_h\n", " # Paste the tile into the big image using the x/y coords to define the top left corner\n", " img.paste(tile_img, box=(x, y))\n", " id = re.search(r'resource\\/(.*)', url).group(1)\n", " # Create file name that includes the image ID info\n", " image_name = 'slsa-{}.jpg'.format(slugify(id))\n", " # Save and display the image\n", " img.save(image_name)\n", " display(FileLink(image_name))\n", " display(HTML(''.format(image_name)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Supply the URL of the photo\n", "\n", "Just paste the url of the photo you want to download between the quotes in the cell below and run the cell using **Shift+Enter**. Once it has been created, the final image will be displayed below with a link for easy download." ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/html": [ "slsa-b-43122.jpg
" ], "text/plain": [ "/Users/tim/mycode/glam-workbench/slsa/notebooks/slsa-b-43122.jpg" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "download_image('https://collections.slsa.sa.gov.au/resource/B+43122')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }