{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Setting SmugMug Print Size and Geotag Keywords with Jupyter and Python\n", "=========================================\n", "\n", "![](jupysmug.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Prerequisites\n", "\n", "This notebook assumes you have set up your environment to use `smugpyter.py`. \n", "Refer to the following notebook for details on how to do this.\n", "\n", "[Getting Ready to use the SmugMug API with Python and Jupyter](https://github.com/bakerjd99/smugpyter/blob/master/notebooks/Getting%20Ready%20to%20use%20the%20SmugMug%20API%20with%20Python%20and%20Jupyter.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why am I doing this?\n", "\n", "Many years ago I wrote a little J verb `smugprintsizes` that computed \n", "the [largest standard SmugMug print sizes](https://analyzethedatanotthedrivel.org/2010/02/21/assigning-smugmug-print-size-keys/) when given image dimensions \n", "and the desired [DPI](https://en.wikipedia.org/wiki/Dots_per_inch). I used \n", "the output of this verb to set aspect ratio keywords for my SmugMug pictures until changes to \n", "SmugMug, particularly the introduction of [OAuth authentication](https://en.wikipedia.org/wiki/OAuth), broke my little SmugMug API application that called `smugprintsizes`. \n", "\n", "The keyword setter has been broken for years, but many of these keys still show up in my \n", "[\"top hundred\"](https://conceptcontrol.smugmug.com/keyword) keywords. \n", "\n", "    10x15 4x5 4x6 5x5 5x6.7 5x7 ...\n", "\n", "Print size keywords were very handy. They made it easy to select paper sizes for one or hundreds of pictures. \n", "This notebook will use the SmugMug API and Python to compute and set print size keywords." 
] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### The Print Sizes Table\n", "\n", "`smugprintsizes` made use of the following table.\n", "\n", "    ┌─────┬─────────┬──────────────┐\n", "    │0.7  │17.5 70  │3.5x5 7x10    │\n", "    ├─────┼─────────┼──────────────┤\n", "    │0.8  │20 80    │4x5 8x10      │\n", "    ├─────┼─────────┼──────────────┤\n", "    │0.755│21.2 84.8│4x5.3 8x10.6  │\n", "    ├─────┼─────────┼──────────────┤\n", "    │0.665│24 96    │4x6 8x12      │\n", "    ├─────┼─────────┼──────────────┤\n", "    │0.5  │32 50 128│4x8 5x10 8x16 │\n", "    ├─────┼─────────┼──────────────┤\n", "    │1    │25 64 100│5x5 8x8 10x10 │\n", "    ├─────┼─────────┼──────────────┤\n", "    │0.745│33.5     │5x6.7         │\n", "    ├─────┼─────────┼──────────────┤\n", "    │0.715│35       │5x7           │\n", "    ├─────┼─────────┼──────────────┤\n", "    │0.165│150      │5x30          │\n", "    ├─────┼─────────┼──────────────┤\n", "    │0.4  │160      │8x20          │\n", "    ├─────┼─────────┼──────────────┤\n", "    │0.775│93.5     │8.5x11        │\n", "    ├─────┼─────────┼──────────────┤\n", "    │0.75 │108      │9x12          │\n", "    ├─────┼─────────┼──────────────┤\n", "    │0.77 │130      │10x13         │\n", "    └─────┴─────────┴──────────────┘\n", "\n", "The first column is the `Short/Long` image aspect ratio rounded to 0.005. The middle column \n", "lists areas in square inches of the corresponding print sizes in the last column.\n", "\n", "This table uses inches but the algorithm doesn't care about units. You can easily\n", "use metric values.\n", "\n", "Finding the largest DPI dependent print size is a simple matter of:\n", "\n", "1. Divide the short image dimension by the long image dimension and round to 0.005.\n", "   This is the aspect ratio.\n", "\n", "2. Search for an aspect ratio match in the first column. Many images will not match.\n", "   Quit and return `0z1` for no aspect match. The `0zN` codes are similar to \n", "   the `NxM` print size codes. This will be important in later notebooks.\n", "\n", "3. If a match is found, compute the print area the image can cover at the given DPI and round to 0.5.\n", "\n", "4. 
Find the index of the largest area in the second column that is less than or equal to the \n", "   area computed in the previous step. If there are not enough pixels no area will meet this criterion.\n", "   Quit and return `0z0` for not enough pixels. \n", "\n", "5. If an area is found select and return the corresponding print size in the last column. Finally, if\n", "   the DPI area exceeds all areas for an aspect ratio return the largest print size.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An image with dimensions of 2389 x 3344 has enough pixels to make a standard 5x7 inch 360 DPI print. \n", "It does not have enough pixels to make a 5x7 inch 720 DPI print. \n", "\n", "Print resolution is a hot button issue for photographers. How many dots (DPI) or pixels (PPI) are \n", "required depends on many factors: viewing distance, illumination, image colors, paper gloss and so on. \n", "Human vision tests have demonstrated that young people with excellent eyesight can tell the difference\n", "between 500 DPI and 600 DPI prints. Resolutions beyond 600 DPI are mostly wasted unless you are using loupes or microscopes.\n", "[According to Dr. Optoglass](https://wolfcrow.com/blog/notes-by-dr-optoglass-the-resolution-of-the-human-eye/):\n", "\n", ">*If the average reading distance is 1 foot (12 inches = 305 mm), p @0.4 arc minute is 35.5 microns or about 720 ppi/dpi. p @1 arc minute is 89 microns or about 300 dpi/ppi. This is why magazines are printed at 300 dpi – it’s good enough for most people. Fine art printers aim for 720, and that’s the best it need be. Very few people stick their heads closer than 1 foot away from a painting or photograph.*\n", "\n", "Digital printers complicate DPI issues by applying sophisticated resizing algorithms that can turn low resolution \n", "originals into plausible higher resolution copies. 
I've found that 360 DPI is a good starting point for SmugMug prints.\n", "For exceptional images you can simply divide the 360 DPI image dimensions by two for 720 DPI printing. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Computing DPI Dependent Print Area\n", "\n", "The use of the print size table is clear with the exception of computing the print area required for a given DPI.\n", "`dpi_area` computes DPI dependent print area." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "aspect ratio 0.715\n", "area at 360 dpi 61.5\n", "area at 720 dpi 15.5\n" ] } ], "source": [ "def round_to(n, precision):\n", " correction = 0.5 if n >= 0 else -0.5\n", " return int( n/precision+correction ) * precision\n", "\n", "def aspect_ratio(height, width, *, precision=0.005):\n", " return round_to( min(height, width) / max(height, width), precision )\n", "\n", "def dpi_area(height, width, *, dpi=360, precision=0.5):\n", " return round_to( (height * width) / dpi ** 2, precision )\n", "\n", "# image pixel dimensions - order is immaterial\n", "height, width = 2389 , 3344\n", "\n", "print('aspect ratio %s' % aspect_ratio(height, width))\n", "print('area at 360 dpi %s' % dpi_area(height, width))\n", "print('area at 720 dpi %s' % dpi_area(height, width, dpi=720))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Representing the Print Size table\n", "\n", "There are many ways to encode the print size table. I am starting with the simplest possible representation: three lists,\n", "one for each column.\n", "\n", "The lists must have the same number of items. Eventually, these details will be hidden within a `SmugPyter` subclass \n", "that manages the details of creating and using print size tables. For now let's build the lists from a simple string." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import smugpyter\n", "smugmug = smugpyter.SmugPyter()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['3.5x5', '4x5', '4x5.3', '4x6', '4x8', '5x5', '5x6.7', '5x7', '5x10', '5x30', '7x10', '8x8', '8x10', '8x10.6', '8x12', '8x16', '8x20', '8.5x11', '9x12', '10x10', '10x13', '10x15', '10x16', '10x20', '10x30', '11x14', '11x16', '11x28', '12x12', '12x18', '12x20', '12x24', '12x30', '16x20', '16x24', '18x24', '20x20', '20x24', '20x30']\n" ] } ], "source": [ "# list of all known small to medium SmugMug print sizes\n", "smug_print_sizes = \"\"\"\n", " 3.5x5 4x5 4x5.3 4x6 4x8 \n", " 5x5 5x6.7 5x7 5x10 5x30 \n", " 7x10 8x8 8x10 8x10.6 8x12 \n", " 8x16 8x20 8.5x11 9x12 10x10 \n", " 10x13 10x15 10x16 10x20 10x30 \n", " 11x14 11x16 11x28 12x12 12x18 \n", " 12x20 12x24 12x30 16x20 16x24 \n", " 18x24 20x20 20x24 20x30 \n", "\"\"\"\n", "\n", "# clean up the usual suspects\n", "smug_print_sizes = smugmug.purify_smugmug_text(smug_print_sizes).split()\n", "print(smug_print_sizes)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.7000000000000001, 0.8, 0.755, 0.665, 0.5, 1.0, 0.745, 0.715, 0.165, 0.4, 0.775, 0.75, 0.625, 0.335, 0.6900000000000001, 0.77, 0.395, 0.6, 0.835, 0.785]\n", "[17.5, 20.0, 21.2, 24.0, 32.0, 25.0, 33.5, 35.0, 50.0, 150.0, 70.0, 64.0, 80.0, 84.8, 96.0, 128.0, 160.0, 93.5, 108.0, 100.0, 130.0, 150.0, 160.0, 200.0, 300.0, 154.0, 176.0, 308.0, 144.0, 216.0, 240.0, 288.0, 360.0, 320.0, 384.0, 432.0, 400.0, 480.0, 600.0]\n" ] } ], "source": [ "all_aspect_ratios = []\n", "all_print_areas = []\n", "\n", "for size in smug_print_sizes:\n", " height , width = size.split('x')\n", " height = float(height) \n", " width = float(width)\n", " ratio = aspect_ratio(height, 
width)\n", " area = height * width\n", " all_aspect_ratios.append(ratio)\n", " all_print_areas.append(area)\n", " \n", "aspect_ratios = list(set(all_aspect_ratios))\n", "print(aspect_ratios)\n", "print(all_print_areas)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def dualsort(a, b):\n", "    \"\"\"\n", "    Sort lists (a) and (b) using (a) to grade (b).\n", "    \"\"\"\n", "    temp = sorted(zip(a, b), key=lambda x: x[0])\n", "    return list(map(list, zip(*temp)))\n", "\n", "# group areas and keys by ratios\n", "gpa = []\n", "gsk = []\n", "for ur in aspect_ratios:\n", "    gp = []\n", "    gk = []\n", "    for ar, pa, sk in zip(all_aspect_ratios, all_print_areas, smug_print_sizes):\n", "        if ur == ar:\n", "            gp.append(pa)\n", "            gk.append(sk)\n", "    # ensure sublists are sorted by ascending area\n", "    gp , gk = dualsort(gp, gk)\n", "    gpa.append(gp)\n", "    gsk.append(gk)\n", "\n", "print_areas = gpa\n", "size_keywords = gsk" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.7000000000000001, 0.8, 0.755, 0.665, 0.5, 1.0, 0.745, 0.715, 0.165, 0.4, 0.775, 0.75, 0.625, 0.335, 0.6900000000000001, 0.77, 0.395, 0.6, 0.835, 0.785]\n", "20\n" ] } ], "source": [ "#aspect_ratios = [0.7, 0.8, 0.755, 0.665, 0.5, 1, 0.745, 0.715, \n", "# 0.165, 0.4, 0.775, 0.75, 0.77]\n", "print(aspect_ratios)\n", "print(len(aspect_ratios))" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[17.5, 70.0], [20.0, 80.0, 320.0], [21.2, 84.8], [24.0, 96.0, 150.0, 216.0, 384.0, 600.0], [32.0, 50.0, 128.0, 200.0, 288.0], [25.0, 64.0, 100.0, 144.0, 400.0], [33.5], [35.0], [150.0], [160.0, 360.0], [93.5], [108.0, 432.0], [160.0], [300.0], [176.0], [130.0], [308.0], [240.0], [480.0], [154.0]]\n", "20\n" ] } ], "source": [ "#print_areas = [[17.5,70],[20,80],[21.2,84.8],[24,96],[32,50,128],\n", "# 
[25,64,100],[33.5],[35],[150],[160],[93.5],[108],[130]]\n", "print(print_areas)\n", "print(len(print_areas))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Minimum Print Size Area\n", "\n", "Any image with a `dpi_area` below the minimum print size table area does not have enough pixels to print. \n", "It's useful to know this value. The following `flatten` function \n", "from [Recipe 4.14, Python Cookbook 3rd Ed](https://www.safaribooksonline.com/library/view/python-cookbook-3rd/9781449357337/) makes it easy to extract this value." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "17.5\n" ] } ], "source": [ "# Iterable lives in collections.abc; the old collections alias was removed in Python 3.10\n", "from collections.abc import Iterable\n", "\n", "def flatten(items):\n", "    \"\"\"Yield items from any nested iterable; see Recipe 4.14 of the Python Cookbook.\"\"\"\n", "    for x in items:\n", "        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):\n", "            yield from flatten(x)\n", "        else:\n", "            yield x\n", "\n", "min_print_area = min(flatten(print_areas))\n", "print(min_print_area)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[['3.5x5', '7x10'], ['4x5', '8x10', '16x20'], ['4x5.3', '8x10.6'], ['4x6', '8x12', '10x15', '12x18', '16x24', '20x30'], ['4x8', '5x10', '8x16', '10x20', '12x24'], ['5x5', '8x8', '10x10', '12x12', '20x20'], ['5x6.7'], ['5x7'], ['5x30'], ['8x20', '12x30'], ['8.5x11'], ['9x12', '18x24'], ['10x16'], ['10x30'], ['11x16'], ['10x13'], ['11x28'], ['12x20'], ['20x24'], ['11x14']]\n", "20\n" ] } ], "source": [ "#size_keywords = [['3.5x5','7x10'],['4x5','8x10'],['4x5.3','8x10.6'],\n", "# ['4x6','8x12'],['4x8','5x10', '8x16'],['5x5','8x8','10x10'],['5x6.7'],\n", "# ['5x7'],['5x30'],['8x20'],['8.5x11'],['9x12'],['10x13']]\n", "print(size_keywords)\n", "print(len(size_keywords))" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", 
"output_type": "stream", "text": [ "3800x3800 at 360 DPI = 10x10\n", "3800x3800 at 720 DPI = 5x5\n", "3000x3000 at 360 DPI = 8x8\n", "2000x2000 at 360 DPI = 5x5\n", "500x500 at 360 DPI = 0z0\n", "10x10 at 360 DPI = 0z0\n", "3255x4119 at 360 DPI = 0z1\n" ] } ], "source": [ "def print_size_key(height, width, *, no_ratio='0z1', no_pixels='0z0', \n", " min_area=17.5, ppi=360, tolerance=0.000005):\n", " \"\"\"\n", " Compute print size key word from image dimensions. \n", " The result is a character string.\n", " \n", " key360 = print_size_key(2000, 3000)\n", " \n", " # (ppi) is identical to dpi here\n", " key720 = print_size_key(2000, 3000, ppi=720) \n", " \"\"\"\n", " \n", " # basic argument check\n", " error_message = '(height), (width) must be positive integers'\n", " if not (isinstance(height, int) and isinstance(width, int)):\n", " raise TypeError(error_message)\n", " elif height <= 0 or width <= 0:\n", " raise ValueError(error_message)\n", " \n", " # area must exceed a minimum size\n", " print_area = dpi_area(height, width, dpi=ppi)\n", " if print_area < min_area:\n", " return no_pixels\n", " \n", " print_ratio = aspect_ratio(height, width)\n", " print_key = no_ratio\n", " for i, ratio in enumerate(aspect_ratios):\n", " if abs(print_ratio - ratio) <= tolerance:\n", " print_key = no_pixels\n", " \n", " # not enough or more than enough area\n", " if print_area < print_areas[i][0]:\n", " break\n", " elif print_area > print_areas[i][-1]:\n", " print_key = size_keywords[i][-1]\n", " break \n", " \n", " for j, area in enumerate(print_areas[i]):\n", " if area >= print_area and 0 < j:\n", " print_key = size_keywords[i][j - 1]\n", " break\n", " \n", " return print_key\n", " \n", "# many sizes available for aspect ratio 1.0\n", "print('3800x3800 at 360 DPI = %s' % print_size_key(3800, 3800))\n", "print('3800x3800 at 720 DPI = %s' % print_size_key(3800, 3800, ppi=720))\n", "print('3000x3000 at 360 DPI = %s' % print_size_key(3000, 3000))\n", "print('2000x2000 at 360 DPI = %s' 
% print_size_key(2000, 2000))\n", "\n", "# not enough pixels\n", "print('500x500 at 360 DPI = %s' % print_size_key(500,500))\n", "print('10x10 at 360 DPI = %s' % print_size_key(10,10)) \n", "\n", "# no ratio \n", "print('3255x4119 at 360 DPI = %s' % print_size_key(3255, 4119))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Testing `print_size_key`\n", "\n", "The `print_size_key` function seems simple enough but when I see three `break` statements in a loop\n", "I set my bullshit detector to \n", "[eleven](https://duckduckgo.com/?q=you+tube+loudness+to+eleven&ia=videos&iax=videos&iai=4xgx4k83zzc) and start \n", "looking for bugs." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# exception throwing blocks rerunning all notebook cells\n", "# print_size_key('not', 'even_wrong') # throw exception" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# print_size_key(-2, -3) # throw exception" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# print_size_key(0, 50) # throw exception" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "True\n", "True\n" ] } ], "source": [ "print('0z0' == print_size_key(1,1)) # not enough pixels\n", "print('0z0' == print_size_key(20,20)) # not enough pixels\n", "print('0z0' == print_size_key(500,500)) # not enough pixels" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "True\n", "True\n" ] } ], "source": [ "print('0z1' == print_size_key(2000,2100)) # ratio not in table\n", "print('0z1' == print_size_key(4000,3500)) # ratio not in table\n", "print('0z1' == print_size_key(1000,5000)) # ratio not in table" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "As `print_size_key` rounds ratios and areas you need slightly more pixels than you might expect \n", "for a given print size. In practice this is not an issue as digital images usually have\n", "more than enough pixels for small standard size prints." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "True\n", "True\n", "True\n", "True\n", "True\n", "True\n", "True\n", "True\n" ] } ], "source": [ "print('0z0' == print_size_key(int(3.5 * 350), 5 * 350)) # 3.5x5 not enough pixels\n", "print('3.5x5' == print_size_key(int(3.5 * 362), 5 * 362)) # 3.5x5\n", "print('7x10' == print_size_key(7 * 362, 10 * 362)) # 7x10\n", "print('5x6.7' == print_size_key(5 * 362, int(6.7 * 362))) # 5x6.7\n", "print('8.5x11' == print_size_key(int(8.5 * 362), 11 * 362)) # 8.5x11\n", "print('10x10' == print_size_key(10 * 362, 10 * 362)) # 10x10\n", "print('10x10' == print_size_key(10 * 722, 10 * 722, ppi=720)) # 10x10 at 720 DPI\n", "print('5x30' == print_size_key(5 * 362, 30 * 362)) # 5x30\n", "print('5x10' == print_size_key(5 * 722, 10 * 722, ppi=720)) # 5x10 at 720 DPI" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0z1\n", "0z1\n", "5x7\n", "5x7\n" ] } ], "source": [ "# selected actual SmugMug image dimensions\n", "print(print_size_key(2396,1991)) \n", "print(print_size_key(2585,1736))\n", "print(print_size_key(4573,3259))\n", "print(print_size_key(2800,1999))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calculating Print Size Keys for SmugMug Album Manifest Files\n", "\n", "In the first notebook of this series I used the SmugMug API to generate folders and files\n", "containing SmugMug image metadata stored in CSV TAB delimited files. Now I will read these manifest files and\n", "compute print size keys." 
] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4wqd5Hr 4x6 3021 2014\n", "K7JKbs8 0z1 2036 3122\n", "nFRxBh2 5x7 2665 3731\n", "xCdD7V8 0z1 2585 1736\n", "sTXnpLm 4x6 2192 3289\n", "VG2s4WG 5x7 3659 2613\n", "kNRs3X8 4x6 1694 2543\n", "Qjs2hr6 4x6 3848 2559\n", "qbXqVgC 4x6 2633 3949\n", "ZdzNXm3 0z1 1162 2506\n", "vF4Bwpg 5x7 2531 3542\n", "7WbqpMj 4x5 3211 2566\n", "2cCVDMK 0z0 1846 2398\n", "36kBgrv 0z1 2396 1991\n", "2FzVqjP 0z0 1887 2398\n" ] } ], "source": [ "import csv\n", "\n", "# raw string keeps the Windows backslashes literal; newline='' is what the csv module expects\n", "with open(r'c:\SmugMirror\Places\Overseas\Ghana1970s\manifest-Ghana1970s-Kng6tg-w.txt', 'r', newline='') as f:\n", "    reader = csv.DictReader(f, dialect='excel', delimiter='\\t') \n", "    for row in reader:\n", "        key = row['ImageKey']\n", "        height , width = int(row['OriginalHeight']), int(row['OriginalWidth'])\n", "        size_key = print_size_key(height, width)\n", "        print(key, size_key, height, width)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The print size keys computed by the Python `print_size_key` function match the keys computed by\n", "the following J verb `printsizekey`. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "    printsizekey=:3 : 0\n", "\n", "    NB.*printsizekey v-- j version of python (print_size_key).\n", "    NB.\n", "    NB. monad: st =. printsizekey btclManifest\n", "    NB.\n", "    NB. mf0=. readtd2 'c:\SmugMirror\Places\Overseas\Ghana1970s\manifest-Ghana1970s-Kng6tg-w.txt'\n", "    NB. mf1=. readtd2 'c:\SmugMirror\Themes\Diaries\CellPhoningItIn\manifest-CellPhoningItIn-PfCsJz-16.txt'\n", "    NB. printsizekey mf0\n", "    NB. printsizekey mf1\n", "    NB.\n", "    NB. dyad: st =. iaDpi printsizekey btclManifest\n", "    NB.\n", "    NB. 720 printsizekey mf1\n", "\n", "    SMUGPRINTDPI printsizekey y\n", "    :\n", "    NB. image keys and dimensions \n", "    d=. y {\"1~ (0{y) i. ;:'ImageKey OriginalHeight OriginalWidth'\n", "    f=. |: _1&\".&> d=. 1 2 {\"1 }. 
d\n", "    'invalid image dimensions' assert 0 < ,f\n", "\n", "    NB. default print size keys\n", "    'area ratio'=. (SMUGASPECTROUND,SMUGAREAROUND,x) dpiarearatio f \n", "    keys=. (#ratio) # s: area\n", "    m2=. +./&> m1\n", "    keys=. (s: m2#m1) {&> 2 {\"1 m2#pst\n", "    keys=. sizes(I. m0 #^:_1 m2)} keys\n", "\n", "    NB. image keys, print size keys, pixels\n", "    NB. smoutput (<\"0 m0 # keys) ,. area ,. pst \n", "    (s: }.0 {\"1 y) , keys , |: s: d \n", "    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Invoking J within Jupyter\n", "\n", "Using a [J addon](http://code.jsoftware.com/wiki/Vocabulary/Libraries) we can run the J verb and compare its output to the Python result. The next cell assumes `jcore.py` and `jbase.py` are on Python's `sys.path`." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 1 2 3 4 5 6\n", "7 8 9 10 11 12 13\n" ] } ], "source": [ "import jcore as j\n", "\n", "j.init(True) # start j\n", "j.dor('i. 2 7') # ping j" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Open the [JOD Dictionary](https://analyzethedatanotthedrivel.org/the-jod-page/) that contains `printsizekey` and fetch the words required to run it." 
] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+-+--------------------------+---------+-------+-----+----+-----+\n", "|1|opened (rw/ro/ro/ro/ro) ->|smugpyter|smugdev|image|smug|utils|\n", "+-+--------------------------+---------+-------+-----+----+-----+\n", "+-+------------------------------+\n", "|1|(16) words loaded into -> base|\n", "+-+------------------------------+\n" ] } ], "source": [ "j.dor(\"require 'general/jod'\") # load JOD addon\n", "j.dor(\"od ;:'smugpyter smugdev image smug utils' [ 3 od '' \") # open image dictionaries\n", "j.dor(\"getrx ;:'printsizekey fmtkeys'\") # get everything required to execute" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.5x5 4x5 4x5.3 4x6 4x8 \n", "5x5 5x6.7 5x7 5x10 5x30 \n", "7x10 8x8 8x10 8x10.6 8x12 \n", "8x16 8x20 8.5x11 9x12 10x10 \n", "10x13 10x15 10x16 10x20 10x30 \n", "11x14 11x16 11x28 12x12 12x18 \n", "12x20 12x24 12x30 16x20 16x24 \n", "18x24 20x20 20x24 20x30 \n" ] } ], "source": [ "j.dor('35 list SMUGPYTERSIZES') # show printsizes table in J " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read the manifest file into J and compute the print size keys." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4wqd5Hr 4x6 3021 2014\n", "K7JKbs8 0z1 2036 3122\n", "nFRxBh2 5x7 2665 3731\n", "xCdD7V8 0z1 2585 1736\n", "sTXnpLm 4x6 2192 3289\n", "VG2s4WG 5x7 3659 2613\n", "kNRs3X8 4x6 1694 2543\n", "Qjs2hr6 4x6 3848 2559\n", "qbXqVgC 4x6 2633 3949\n", "ZdzNXm3 0z1 1162 2506\n", "vF4Bwpg 5x7 2531 3542\n", "7WbqpMj 4x5 3211 2566\n", "2cCVDMK 0z0 1846 2398\n", "36kBgrv 0z1 2396 1991\n", "2FzVqjP 0z0 1887 2398\n" ] } ], "source": [ "j.dor(\"mf0=. 
readtd2 'c:\\SmugMirror\\Places\\Overseas\\Ghana1970s\\manifest-Ghana1970s-Kng6tg-w.txt'\")\n", "j.dor('fmtkeys printsizekey mf0')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The J verb and the Python function use completely different approaches but arrive at\n", "the same result. ***If you really care about the answer do it more than once and practice relentless verification!***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following functions generalize setting print size keywords for manifest files." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['0x0', 'ahead', 'boo_hoo', 'go', 'test_me', 'usa', 'utterly_wrong', 'you_are_so']\n", "[]\n", "[]\n", "['all_right', 'alll_right', 'allll_right']\n" ] } ], "source": [ "test0 = 'go;ahead;test me;boo hoo ; you are so; 0x0; utterly wrong ; united states'\n", "test1 = 'all_right; alll_right; allll_right'\n", "\n", "def standard_keywords(keywords, *, blank_fill='_', \n", " split_delimiter=';',\n", " substitutions=[('united_states','usa')]):\n", " \"\"\"\n", " Return a list of keywords in standard form.\n", " \n", " Reduces multiple blanks to one, converts to lower case, and replaces\n", " any remaining blanks with (blank_fill). 
This ensures keywords are contiguous\n", "    lower case or (blank_fill) joined lower case character runs.\n", "    \n", "    Note: the odd choice of '_' for the blank fill is because hyphens appear\n", "    to be stripped from keywords on SmugMug.\n", "    \n", "        standard_keywords('go;ahead;test me;boo hoo ; you are so; 0x0; united states')\n", "    \"\"\"\n", "    \n", "    # basic argument check\n", "    error_message = '(keywords) must be a string'\n", "    if not isinstance(keywords, str):\n", "        raise TypeError(error_message)\n", "    \n", "    if len(keywords.strip(' ')) == 0:\n", "        return []\n", "    else:\n", "        keys = ' '.join(keywords.split()) \n", "        keys = split_delimiter.join([s.strip().lower() for s in keys.split(split_delimiter)])\n", "        keys = ''.join(blank_fill if c == ' ' else c for c in keys)\n", "        # replace some keywords with others\n", "        for k, s in substitutions:\n", "            keys = keys.replace(k, s)\n", "        # return sorted list - size keys sort to the front because digits precede letters\n", "        keylist = keys.split(split_delimiter)\n", "        return sorted(keylist)\n", "\n", "print(standard_keywords(test0))\n", "print(standard_keywords(''))\n", "print(standard_keywords(' '))\n", "print(standard_keywords(test1))" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import re\n", "\n", "def update_size_keyword(size_keyword, keywords, split_delimiter=';'):\n", "    \"\"\"\n", "    Update the print size keyword for a single image\n", "    and standardize the format of any remaining keywords.\n", "    Result is a (boolean, string) tuple.\n", "    \"\"\"\n", "    # basic argument check\n", "    error_message = '(size_keyword), (keywords) must be nonempty strings'\n", "    if not (isinstance(size_keyword, str) and isinstance(keywords, str)):\n", "        raise TypeError(error_message)\n", "    elif len(size_keyword.strip(' ')) == 0:\n", "        raise ValueError(error_message)\n", "    \n", "    if len(keywords.strip(' ')) == 0:\n", "        return (False, size_keyword)\n", "    \n", "    inkeys = [s.strip().lower() for s in 
keywords.split(split_delimiter)]\n", "    if 0 == len(inkeys):\n", "        return (False, size_keyword)\n", "    \n", "    outkeys = [size_keyword]\n", "    for inword in inkeys:\n", "        # remove any existing print size keys\n", "        if re.match(r\"\\d+(\\.\\d+)?[xz]\\d+(\\.\\d+)?\", inword) is not None:\n", "            continue\n", "        else:\n", "            outkeys.append(inword)\n", "    \n", "    # return standard unique sorted keys\n", "    outkeys = sorted(set(outkeys))\n", "    outkeys = standard_keywords(split_delimiter.join(outkeys))\n", "    return (set(outkeys) == set(inkeys), (split_delimiter+' ').join(outkeys))\n", "\n", "def print_keywords(manifest_file):\n", "    \"\"\"\n", "    Set print size keywords for images in album manifest file.\n", "    Result is a tuple (image_count, change_count, changed_keywords).\n", "    (changed_keywords) is a list of dictionaries in (csv.DictWriter) format.\n", "    \"\"\"\n", "    changed_keywords = []\n", "    image_count , change_count = 0 , 0\n", "    with open(manifest_file, 'r') as f:\n", "        reader = csv.DictReader(f, dialect='excel', delimiter='\\t') \n", "        for row in reader:\n", "            image_count += 1\n", "            key = row['ImageKey']\n", "            height , width = int(row['OriginalHeight']), int(row['OriginalWidth'])\n", "            size_key = print_size_key(height, width)\n", "            same, keywords = update_size_keyword(size_key, row['Keywords'])\n", "            if not same:\n", "                change_count += 1\n", "                changed_keywords.append({'ImageKey': key, 'AlbumKey': row['AlbumKey'],\n", "                                         'FileName': row['FileName'], 'Keywords': keywords,\n", "                                         'OldKeywords': row['Keywords']})\n", "    \n", "    # when no images are changed return a header place holder row\n", "    if change_count == 0:\n", "        changed_keywords.append({'ImageKey': None, 'AlbumKey': None, 'FileName': None, \n", "                                 'Keywords': None, 'OldKeywords': None})\n", "    \n", "    return (image_count, change_count, changed_keywords)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(15,\n", " 0,\n", " [{'AlbumKey': None,\n", " 'FileName': 
None,\n", " 'ImageKey': None,\n", " 'Keywords': None,\n", " 'OldKeywords': None}])" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print_keywords('c:\SmugMirror\Places\Overseas\Ghana1970s\manifest-Ghana1970s-Kng6tg-w.txt')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Testing `update_size_keyword`" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# update_size_keyword('4x5', 3) # throw exception" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# update_size_keyword('', ' ok; but; size; key; bad') # throw exception" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "True\n", "True\n", "True\n", "True\n", "True\n" ] } ], "source": [ "print('4x6' == update_size_keyword('4x6', ' ')[1])\n", "print('4x6; boo' == update_size_keyword('4x6', 'boo')[1]) \n", "print('4x6; aha; boo; boys' == update_size_keyword('4x6', 'aha; boo; BOO; boo; boys')[1])\n", "print('4x6' == update_size_keyword('4x6', '5x7; 8x12; 3x4; 3.5x5')[1]) \n", "print('4x6; boo; home; yo' == update_size_keyword('4x6', '5x7; 8x12; 3x4; 3.5x5; yo; yo; home; BOO')[1])\n", "print(update_size_keyword('4x6', '4x6; boo; hoo; too')[0]) # no keyword changes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Posting SmugMug Print Size Keywords\n", "\n", "The next step is to post the computed print size keywords to SmugMug. For this, we need\n", "an API call that sets keywords. The `SmugPyter` class does not have a keyword setting function. 
\n", "We will have to fake it.\n", "\n", "In case you are wondering, faking it is a fundamental skill that all programmers must master.\n", "Remember how Scotty in the original Star Trek series constantly told Kirk that he couldn't\n", "sustain high warp without wrecking the Enterprise but somehow always managed to do it and walk away\n", "intact. Sure, the Enterprise wasn't designed for the stresses it was forced to endure, but Scotty\n", "hacked it on the fly. \n", "\n", "A lot of programming is like that. You're working with half-baked buggy tools that will not\n", "sustain warp but you have to pull it off. Be grateful you're not dodging photon torpedoes." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('Kng6tg', 'Ghana1970s', 'w')" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import os \n", "\n", "def album_id_from_file(filename):\n", "    \"\"\"\n", "    Extracts the (album_id, name, mask) from file names. 
\n", " Depends on file naming conventions.\n", " \n", " album_id_from_file('c:\\SmugMirror\\Places\\Overseas\\Ghana1970s\\manifest-Ghana1970s-Kng6tg-w.txt') \n", " \"\"\"\n", " mask, album_id, name = filename.split('-')[::-1][:3]\n", " mask = mask.split('.')[0]\n", " return (smugmug.case_mask_decode(album_id, mask), name, mask)\n", "\n", "manifest_file = 'c:\\SmugMirror\\Places\\Overseas\\Ghana1970s\\manifest-Ghana1970s-Kng6tg-w.txt'\n", "album_id_from_file(manifest_file)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(15, 0)" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def changes_filename(manifest_file):\n", " \"\"\"\n", " Changes file name from manifest file name.\n", " \"\"\"\n", " album_id, name, mask = album_id_from_file(manifest_file)\n", " path = os.path.dirname(manifest_file)\n", " changes_name = \"changes-%s-%s-%s\" % (name, album_id, mask)\n", " changes_file = path + \"/\" + changes_name + '.txt'\n", " return changes_file\n", " \n", "def write_size_keyword_changes(manifest_file):\n", " \"\"\"\n", " Write TAB delimited file of changed metadata.\n", " Return album and keyword (image_count, change_count) tuple.\n", " \n", " manifest_file = 'c:\\SmugMirror\\Places\\Overseas\\Ghana1970s\\manifest-Ghana1970s-Kng6tg-w.txt'\n", " write_size_keyword_changes(manifest_file) \n", " \"\"\"\n", " image_count, change_count, keyword_changes = print_keywords(manifest_file)\n", " changes_file = changes_filename(manifest_file)\n", " keys = keyword_changes[0].keys()\n", " with open(changes_file, 'w', newline='') as output_file:\n", " dict_writer = csv.DictWriter(output_file, keys, dialect='excel-tab')\n", " dict_writer.writeheader()\n", " # for no changes write header only\n", " if change_count > 0:\n", " dict_writer.writerows(keyword_changes) \n", " return(image_count, change_count)\n", " \n", "write_size_keyword_changes(manifest_file)" ] }, { "cell_type": 
"code", "execution_count": 31, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def update_all_keyword_changes_files(root):\n", " \"\"\"\n", " Scan all manifest files in local directories and\n", " generate TAB delimited CSV keyword changes files.\n", " \"\"\"\n", " total_images , total_changes = 0 , 0\n", " pattern = \"manifest-\"\n", " alist_filter = ['txt'] \n", " for r,d,f in os.walk(root):\n", " for file in f:\n", " if file[-3:] in alist_filter and pattern in file:\n", " file_name = os.path.join(root,r,file)\n", " image_count, change_count = write_size_keyword_changes(file_name)\n", " if change_count > 0:\n", " print(file_name)\n", " total_images += image_count\n", " total_changes += change_count\n", " print('image count %s, change count %s' % (total_images, total_changes))" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "image count 4254, change count 0\n" ] } ], "source": [ "# %timeit update_all_keyword_changes_files('c:\\SmugMirror')\n", "update_all_keyword_changes_files('c:\\SmugMirror')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Issuing SmugMug API `PATCH` Requests\n", "\n", "Now that the `CSV` change files are ready the next step is to read them and reset keywords. You can do this with a SmugMug `PATCH` request. \n", "\n", "My attempts to issue `PATCH` requests did not meet with a lot of success until I traded a few emails with the SmugMug API support team at `api@smugmug.com`. They advised me to turn off redirects. It was a simple parameter setting but it would have taken me days to figure it on my own. Kudos to the excellent API support at SmugMug." 
] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import requests\n", "import json\n", "from requests_oauthlib import OAuth1" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": true }, "outputs": [], "source": [ "auth = OAuth1(smugmug.consumer_key, smugmug.consumer_secret,\n", "              smugmug.access_token, smugmug.access_token_secret, smugmug.username)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# attempt to set keywords\n", "r = requests.patch(url='https://api.smugmug.com/api/v2/image/8rjZsTB',\n", "                   auth=auth,\n", "                   data=json.dumps({\"Keywords\": \"these; are; brand; spanking; new; keywords\"}),\n", "                   headers={'Accept':'application/json','Content-Type':'application/json'},\n", "                   allow_redirects=False)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def change_image_keywords(image_id, keywords):\n", "    \"\"\"\n", "    PATCH the keywords of one image. Raises if the status is not 301.\n", "    \"\"\"\n", "    r = requests.patch(url='https://api.smugmug.com/api/v2/image/' + image_id,\n", "                       auth=auth,\n", "                       data=json.dumps({\"Keywords\": keywords}),\n", "                       headers={'Accept':'application/json','Content-Type':'application/json'},\n", "                       allow_redirects=False)\n", "    # with redirects off, the successful call in this notebook came back as 301\n", "    if r.status_code != 301:\n", "        raise Exception(\"Not what the doctor ordered\")\n", "\n", "    return 'changed'" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'changed'" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "change_image_keywords('8rjZsTB', 'more; new; keywords; ehh')" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def change_keywords(changes_file):\n", "    \"\"\"\n", "    Change keywords for images in album changes
file.\n", "    \"\"\"\n", "    change_count = 0\n", "    with open(changes_file, 'r') as f:\n", "        reader = csv.DictReader(f, dialect='excel', delimiter='\\t')\n", "        for row in reader:\n", "            change_count += 1\n", "            image_key = row['ImageKey']\n", "            keywords = row['Keywords']\n", "            # print(image_key, keywords)\n", "            change_image_keywords(image_key, keywords)\n", "    return change_count\n", "\n", "change_keywords('c:/SmugMirror/Other/utilimages/changes-utilimages-GMLn9k-1k.txt')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once an album's print size keywords have been changed, regenerating the print size keyword changes file should result in a file with no pending changes.\n", "\n", "*Note: posted keyword changes appear to become active on SmugMug immediately, but re-pulling them right away returns the prior keyword list. This may be a SmugMug server update issue. I will check later.*\n", "\n", "*P.S. It takes a day or two for all keyword changes to percolate through SmugMug's servers. When I rescanned keywords a day or so after a mass update, all my change files were emptied.
This is exactly what I was expecting.*" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(107, 0)" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "write_size_keyword_changes('c:/SmugMirror/Other/utilimages/manifest-utilimages-GMLn9k-1k.txt')" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def update_all_keyword_changes(root):\n", "    \"\"\"\n", "    Scan all changes files in local directories\n", "    and apply keyword changes.\n", "    \"\"\"\n", "    total_changes = 0\n", "    pattern = \"changes-\"\n", "    alist_filter = ['txt']\n", "    for r, d, f in os.walk(root):\n", "        for file in f:\n", "            if file[-3:] in alist_filter and pattern in file:\n", "                # r already starts with root, so join only r and file\n", "                change_count = change_keywords(os.path.join(r, file))\n", "                total_changes += change_count\n", "    print('change count %s' % total_changes)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "change count 0\n" ] } ], "source": [ "# takes a while to plow through thousands of updates\n", "update_all_keyword_changes(r'c:\\SmugMirror')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setting a `geotagged` Keyword\n", "\n", "Now that we can easily set keywords, it's a simple matter to scan the manifest files and set a `geotagged` keyword for all images that have a nonzero latitude or longitude. The most common latitude, longitude and altitude value in the manifest files is the default `(0,0,0)`. \n", "If you [look at a map](https://www.google.com/maps/place/0%C2%B000'00.0%22N+0%C2%B000'00.0%22E/@-2.4635807,4.1955676,4.5z/data=!4m2!3m1!1s0x0:0x0?hl=en) \n", "you'll see this coordinate is in the Atlantic Ocean off the west coast of Africa. I have taken exactly zero pictures at this location."
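, "\n", "\n", "The test implied by the paragraph above can be sketched as a small predicate. This is only an illustration: `needs_geotag` is a hypothetical helper, not part of `SmugPyter`, and it assumes keywords are separated by semicolons as in the manifest files.\n", "\n", "```python\n", "def needs_geotag(latitude, longitude, keywords, geotag_key='geotagged'):\n", "    \"\"\"True when the position is not the (0, 0) default and the keyword is missing.\"\"\"\n", "    has_position = float(latitude) != 0.0 or float(longitude) != 0.0\n", "    current = [k.strip().lower() for k in keywords.split(';')]\n", "    return has_position and geotag_key not in current\n", "\n", "print(needs_geotag('0', '0', 'ghana; 4x6'))                     # False\n", "print(needs_geotag('9.39672', '-0.81673', 'ghana'))             # True\n", "print(needs_geotag('9.39672', '-0.81673', 'ghana; geotagged'))  # False\n", "```"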
] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def geotag_images(manifest_file, *, split_delimiter=';', geotag_key='geotagged'):\n", "    \"\"\"\n", "    Sets a geotagged keyword for nongeotagged images with nonzero latitude or longitude.\n", "    \"\"\"\n", "    change_count = 0\n", "    with open(manifest_file, 'r') as f:\n", "        reader = csv.DictReader(f, dialect='excel', delimiter='\\t')\n", "        for row in reader:\n", "            key = row['ImageKey']\n", "            latitude = float(row['Latitude'])\n", "            longitude = float(row['Longitude'])\n", "            if latitude != 0.0 or longitude != 0.0:\n", "                keywords = row['Keywords']\n", "                inkeys = [s.strip().lower() for s in keywords.split(split_delimiter)]\n", "\n", "                # if an image is already geotagged skip it\n", "                if geotag_key in inkeys:\n", "                    continue\n", "\n", "                outkeys = sorted(list(set(inkeys)))\n", "                outkeys.append(geotag_key)\n", "                new_keywords = (split_delimiter+' ').join(outkeys)\n", "                outkeys = standard_keywords(new_keywords, split_delimiter=split_delimiter)\n", "                same, new_keywords = (set(outkeys) == set(inkeys), (split_delimiter+' ').join(outkeys))\n", "                if not same:\n", "                    change_count += 1\n", "                    # print(manifest_file)\n", "                    # print(key, new_keywords)\n", "                    change_image_keywords(key, new_keywords)\n", "    return change_count\n", "\n", "geotag_images(r'c:\\SmugMirror\\Places\\Overseas\\Ghana1970s\\manifest-Ghana1970s-Kng6tg-w.txt')" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "change count 0\n" ] } ], "source": [ "def set_all_geotags(root):\n", "    \"\"\"\n", "    Scan all manifest files in local directories and set\n", "    geotags for images with nonzero latitude or longitude\n", "    that are not geotagged.\n", "    \"\"\"\n", "    total_changes = 0\n", "    pattern = \"manifest-\"\n", "    alist_filter = ['txt']\n",
"    for r, d, f in os.walk(root):\n", "        for file in f:\n", "            if file[-3:] in alist_filter and pattern in file:\n", "                # r already starts with root, so join only r and file\n", "                file_name = os.path.join(r, file)\n", "                change_count = geotag_images(file_name)\n", "                total_changes += change_count\n", "    print('change count %s' % total_changes)\n", "\n", "set_all_geotags(r'c:\\SmugMirror')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setting Reverse Geocode Keywords\n", "\n", "As a final example of setting SmugMug keywords, let's reverse geocode images with a nonzero latitude or longitude.\n", "Reverse geocoding is the dark art of taking a latitude and longitude and turning them into a standard place name.\n", "That evil, SJW-infested warren of privacy-invading weasels known as Google has a free,\n", "request-limited API that reverse geocodes. You can ping this API a few times without an API key, but to do anything\n", "remotely serious you need an API key. API keys come in two flavors: free and not free. Let's try free first.\n", "\n", "If you obtain a free\n", "[Google Maps API key](https://developers.google.com/maps/documentation/javascript/get-api-key) \n", "you can make 2,500 API calls per day. I currently have roughly a thousand geotagged images \n", "on SmugMug. With a little care I should be able to reverse geocode my images in a day or two.\n", "\n", "Google provides a \n", "[Python Google Maps API](https://github.com/googlemaps/google-maps-services-python). I looked over \n", "the code and decided it was overkill. I poked around and found a blog post, \n", "[Batch CSV Geocoding in Python with Google Maps API](https://www.shanelynn.ie/batch-geocoding-in-python-with-google-geocoding-api/), \n", "that basically outlines what I want to do here. Shane's post describes geocoding. When geocoding you \n", "supply a place name and turn it into a latitude and longitude.
I want the reverse, hence the name \"reverse geocoding.\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add Your Maps API Key to the `SmugPyter` Config File\n", "\n", "After getting my Google Maps API key, I added it to the `SmugPyter` configuration under a new `[GOOGLEMAPS]` section. " ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'C:\\\\Users\\\\john\\\\.smugpyter.cfg'" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "smugmug.smugmug_config" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": true }, "outputs": [], "source": [ "_ = smugmug.google_maps_key" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make a Maps API Request" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# set up reverse geocoding test urls\n", "latlng0 = '45.39584,-113.98174' # Idaho\n", "latlng1 = '9.39672,-0.81673' # Ghana\n", "latlng2 = '45.35997,-75.71876' # Canada\n", "reverse_geocode_url0 = \"https://maps.googleapis.com/maps/api/geocode/json?latlng={}\".format(latlng0)\n", "reverse_geocode_url1 = \"https://maps.googleapis.com/maps/api/geocode/json?latlng={}\".format(latlng1)\n", "reverse_geocode_url2 = \"https://maps.googleapis.com/maps/api/geocode/json?latlng={}\".format(latlng2)\n", "if smugmug.google_maps_key is not None:\n", "    reverse_geocode_url0 = reverse_geocode_url0 + \"&key={}\".format(smugmug.google_maps_key)\n", "    reverse_geocode_url1 = reverse_geocode_url1 + \"&key={}\".format(smugmug.google_maps_key)\n", "    reverse_geocode_url2 = reverse_geocode_url2 + \"&key={}\".format(smugmug.google_maps_key)\n", "\n", "# print(reverse_geocode_url0)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# ping google - remember you only get 2,500 freebies per day.\n", "results0 =
requests.get(reverse_geocode_url0)\n", "results0 = results0.json()\n", "results1 = requests.get(reverse_geocode_url1)\n", "results1 = results1.json()\n", "results2 = requests.get(reverse_geocode_url2)\n", "results2 = results2.json()" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'address_components': [{'long_name': '1952-1982',\n", " 'short_name': '1952-1982',\n", " 'types': ['street_number']},\n", " {'long_name': 'Casey Road', 'short_name': 'US-93', 'types': ['route']},\n", " {'long_name': 'Salmon',\n", " 'short_name': 'Salmon',\n", " 'types': ['locality', 'political']},\n", " {'long_name': 'Lemhi County',\n", " 'short_name': 'Lemhi County',\n", " 'types': ['administrative_area_level_2', 'political']},\n", " {'long_name': 'Idaho',\n", " 'short_name': 'ID',\n", " 'types': ['administrative_area_level_1', 'political']},\n", " {'long_name': 'United States',\n", " 'short_name': 'US',\n", " 'types': ['country', 'political']},\n", " {'long_name': '83467', 'short_name': '83467', 'types': ['postal_code']}],\n", " 'formatted_address': '1952-1982 US-93, Salmon, ID 83467, USA',\n", " 'geometry': {'bounds': {'northeast': {'lat': 45.3987156,\n", " 'lng': -113.9803244},\n", " 'southwest': {'lat': 45.39577569999999, 'lng': -113.9847687}},\n", " 'location': {'lat': 45.396023, 'lng': -113.9815654},\n", " 'location_type': 'RANGE_INTERPOLATED',\n", " 'viewport': {'northeast': {'lat': 45.3987156, 'lng': -113.9803244},\n", " 'southwest': {'lat': 45.39577569999999, 'lng': -113.9847687}}},\n", " 'place_id': 'EikxOTUyLTE5ODIgQ2FzZXkgUmQsIFNhbG1vbiwgSUQgODM0NjcsIFVTQQ',\n", " 'types': ['street_address']},\n", " {'address_components': [{'long_name': '83467',\n", " 'short_name': '83467',\n", " 'types': ['postal_code']},\n", " {'long_name': 'Salmon',\n", " 'short_name': 'Salmon',\n", " 'types': ['locality', 'political']},\n", " {'long_name': 'Lemhi County',\n", " 'short_name': 'Lemhi County',\n", " 'types': 
['administrative_area_level_2', 'political']},\n", " {'long_name': 'Idaho',\n", " 'short_name': 'ID',\n", " 'types': ['administrative_area_level_1', 'political']},\n", " {'long_name': 'United States',\n", " 'short_name': 'US',\n", " 'types': ['country', 'political']}],\n", " 'formatted_address': 'Salmon, ID 83467, USA',\n", " 'geometry': {'bounds': {'northeast': {'lat': 45.4116059,\n", " 'lng': -113.4491199},\n", " 'southwest': {'lat': 44.69231389999999, 'lng': -114.2212051}},\n", " 'location': {'lat': 44.9479845, 'lng': -113.9660111},\n", " 'location_type': 'APPROXIMATE',\n", " 'viewport': {'northeast': {'lat': 45.4116059, 'lng': -113.4491199},\n", " 'southwest': {'lat': 44.69231389999999, 'lng': -114.2212051}}},\n", " 'place_id': 'ChIJORdJq_krWFMRJ_DosdS1_KQ',\n", " 'types': ['postal_code']},\n", " {'address_components': [{'long_name': 'Lemhi County',\n", " 'short_name': 'Lemhi County',\n", " 'types': ['administrative_area_level_2', 'political']},\n", " {'long_name': 'Idaho',\n", " 'short_name': 'ID',\n", " 'types': ['administrative_area_level_1', 'political']},\n", " {'long_name': 'United States',\n", " 'short_name': 'US',\n", " 'types': ['country', 'political']}],\n", " 'formatted_address': 'Lemhi County, ID, USA',\n", " 'geometry': {'bounds': {'northeast': {'lat': 45.705883, 'lng': -112.813604},\n", " 'southwest': {'lat': 44.230235, 'lng': -114.8201151}},\n", " 'location': {'lat': 45.0364592, 'lng': -113.9230554},\n", " 'location_type': 'APPROXIMATE',\n", " 'viewport': {'northeast': {'lat': 45.705883, 'lng': -112.813604},\n", " 'southwest': {'lat': 44.230235, 'lng': -114.8201151}}},\n", " 'place_id': 'ChIJ0792dKEnWFMR9q9wjunaUDo',\n", " 'types': ['administrative_area_level_2', 'political']},\n", " {'address_components': [{'long_name': 'Idaho',\n", " 'short_name': 'ID',\n", " 'types': ['administrative_area_level_1', 'political']},\n", " {'long_name': 'United States',\n", " 'short_name': 'US',\n", " 'types': ['country', 'political']}],\n", " 'formatted_address': 
'Idaho, USA',\n", " 'geometry': {'bounds': {'northeast': {'lat': 49.0011461, 'lng': -111.043495},\n", " 'southwest': {'lat': 41.9880051, 'lng': -117.243027}},\n", " 'location': {'lat': 44.0682019, 'lng': -114.7420408},\n", " 'location_type': 'APPROXIMATE',\n", " 'viewport': {'northeast': {'lat': 49.0011461, 'lng': -111.043495},\n", " 'southwest': {'lat': 41.9880051, 'lng': -117.243027}}},\n", " 'place_id': 'ChIJ6Znkhaj_WFMRWIf3FQUwa9A',\n", " 'types': ['administrative_area_level_1', 'political']},\n", " {'address_components': [{'long_name': 'United States',\n", " 'short_name': 'US',\n", " 'types': ['country', 'political']}],\n", " 'formatted_address': 'United States',\n", " 'geometry': {'bounds': {'northeast': {'lat': 71.5388001, 'lng': -66.885417},\n", " 'southwest': {'lat': 18.7763, 'lng': 170.5957}},\n", " 'location': {'lat': 37.09024, 'lng': -95.712891},\n", " 'location_type': 'APPROXIMATE',\n", " 'viewport': {'northeast': {'lat': 49.38, 'lng': -66.94},\n", " 'southwest': {'lat': 25.82, 'lng': -124.39}}},\n", " 'place_id': 'ChIJCzYy5IS16lQRQrfeQ5K5Oxw',\n", " 'types': ['country', 'political']}]" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results0[\"results\"]" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Idaho, USA\n", "Northern Region, Ghana\n", "Ontario, Canada\n" ] } ], "source": [ "# extract only state or province (admistrative level 1) and country\n", "if results0[\"status\"] == \"OK\" and results1[\"status\"] == \"OK\" and results2[\"status\"] == \"OK\":\n", " print(results0[\"results\"][-2]['formatted_address'])\n", " print(results1[\"results\"][-2]['formatted_address'])\n", " print(results2[\"results\"][-2]['formatted_address'])" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['ontario', 'canada']\n" ] } ], "source": [ 
"state_country = results2[\"results\"][-2]['formatted_address']\n", "reverse_keys = [s.strip().lower() for s in state_country.split(',')]\n", "print(reverse_keys)" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(2, ['idaho', 'usa'])\n", "(1, ['usa'])\n" ] } ], "source": [ "def reverse_geocode(latitude, longitude):\n", " \"\"\"\n", " Returns state or province and country keywords from latitude and longitude.\n", " \"\"\"\n", " count_reverse_codes = (0, [])\n", " latlng = '%s,%s' % (latitude, longitude)\n", " reverse_geocode_url = \"https://maps.googleapis.com/maps/api/geocode/json?latlng=%s&key=%s\"\n", " reverse_geocode_url = reverse_geocode_url % (latlng, smugmug.google_maps_key)\n", " results = requests.get(reverse_geocode_url)\n", " results = results.json()\n", " \n", " if results[\"status\"] == \"OK\":\n", " try:\n", " state_country = results[\"results\"][-2]['formatted_address']\n", " reverse_keys = standard_keywords(state_country, split_delimiter=',')\n", " count_reverse_codes = (len(reverse_keys), reverse_keys)\n", " except Exception as e:\n", " # ignore any errors - no reverse geocodes for you\n", " count_reverse_codes = (0, [])\n", " print('unable to reverse geocode %s' % latlng)\n", " \n", " return count_reverse_codes\n", "\n", "print(reverse_geocode(45.39584,-113.98174))\n", "print(reverse_geocode(40.76814,-111.88988)) # some usa locations report united_states - remap to usa" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def reverse_geocode_images(manifest_file, *, split_delimiter=';', geotag_key='geotagged'):\n", " \"\"\"\n", " Reverse geocode images with nonzero latitude and longitude.\n", " \"\"\"\n", " change_count = 0\n", " with open(manifest_file, 'r') as f:\n", " reader = csv.DictReader(f, dialect='excel', delimiter='\\t') \n", " for row in reader:\n", " key = row['ImageKey']\n", " latitude = 
float(row['Latitude'])\n", "            longitude = float(row['Longitude'])\n", "            if latitude != 0.0 or longitude != 0.0:\n", "                keywords = row['Keywords']\n", "                inkeys = [s.strip().lower() for s in keywords.split(split_delimiter)]\n", "\n", "                # if an image is already geotagged skip it - edit the\n", "                # changes file and strip (geotag_key) to reprocess\n", "                if geotag_key in inkeys:\n", "                    continue\n", "\n", "                reverse_count, reverse_keywords = reverse_geocode(latitude, longitude)\n", "                if reverse_count == 0:\n", "                    continue\n", "                else:\n", "                    outkeys = inkeys + reverse_keywords\n", "                    outkeys.append(geotag_key)\n", "                    outkeys = sorted(list(set(outkeys)))\n", "                    new_keywords = (split_delimiter+' ').join(outkeys)\n", "                    outkeys = standard_keywords(new_keywords, split_delimiter=split_delimiter)\n", "                    same, new_keywords = (set(outkeys) == set(inkeys), (split_delimiter+' ').join(outkeys))\n", "                    if not same:\n", "                        print(reverse_keywords)\n", "                        change_count += 1\n", "                        change_image_keywords(key, new_keywords)\n", "    return change_count\n", "\n", "# reverse_geocode_images(r'c:\\SmugMirror\\Places\\Overseas\\Ghana1970s\\manifest-Ghana1970s-Kng6tg-w.txt')" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "change count 0\n" ] } ], "source": [ "def set_all_reverse_geocodes(root):\n", "    \"\"\"\n", "    Scan all manifest files in local directories and set\n", "    reverse geocode keys for nongeotagged images with nonzero\n", "    latitude or longitude.\n", "\n", "    Note: limited to 2,500 free Google geocode API calls per day.\n", "    \"\"\"\n", "    total_changes = 0\n", "    pattern = \"manifest-\"\n", "    alist_filter = ['txt']\n", "    for r, d, f in os.walk(root):\n", "        for file in f:\n", "            if file[-3:] in alist_filter and pattern in file:\n", "                # r already starts with root, so join only r and file\n", "                change_count = reverse_geocode_images(os.path.join(r, file))\n", "                total_changes += change_count\n", "    print('change count %s' % total_changes)\n", "\n",
"set_all_reverse_geocodes('c:\\SmugMirror')" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Next on the Agenda!\n", "\n", "Now that I have worked through a proof of concept the next notebook will condense and refine the code in this notebook into a `SmugPyter` print size keyword setting subclass.\n", "\n", "Remember, always [Analyze the Data not the Drivel](https://analyzethedatanotthedrivel.org/).\n", "\n", "John Baker, Meridian Idaho" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 2 }