{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Downloading SmugMug Captions with Python and Jupyter\n", "=========================================\n", "\n", "![](jupysmug.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequistes\n", "\n", "This notebook assumes you have set up your environment to use `smugpyter.py`. \n", "Refer to this notebook for details on how to do this.\n", "\n", "[Getting Ready to use the SmugMug API with Python and Jupyter](https://github.com/bakerjd99/smugpyter/blob/master/notebooks/Getting%20Ready%20to%20use%20the%20SmugMug%20API%20with%20Python%20and%20Jupyter.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Why am I doing this?\n", "My [photo captions](https://conceptcontrol.smugmug.com) have\n", "evolved into a form of *milliblogging*. *Milliposts* (milliblog posts) are terse and\n", "tiny; many are single sentences or paragraphs. Taken one-at-a-time\n", "milliposts seldom impress but when gathered in hundreds or\n", "thousands accidental epics emerge. So, to prevent \"epic loss\" I want\n", "a simple way of downloading and archiving my captions off-line.\n", "\n", "## If you don't control it you cannot trust it!\n", "When I started [blogging](https://analyzethedatanotthedrivel.org) I knew\n", "that you could not depend on blogging websites to archive and\n", "preserve your documents. We had already seen cases of websites mangling content, shutting\n", "down without warning, and even worse, *censoring* bloggers. It was a classic case\n", "of, “If you don't control it you cannot trust it.\" I resolved to maintain complete off-line *version\n", "controlled* copies of my blog posts.\n", "\n", "Maintaining off-line copies was made easier by\n", "[WordPress.com](https://wordpress.com/)'s excellent [blog export\n", "utility](https://en.blog.wordpress.com/2006/06/12/xml-import-export/). A\n", "simple button push downloads a large `XML` file that contains all your\n", "blog posts with embedded references to images and other inclusions. `XML`\n", "is not my preferred archive format. I am a huge fan of `LaTeX` and\n", "`Markdown`: two text formats that are directly supported in Jupyter\n", "Notebooks. I wrote a little system that parses the WordPress `XML` file and [generates\n", "LaTeX and Markdown](https://analyzethedatanotthedrivel.org/2012/02/11/wordpress-to-latex-with-pandoc-and-j-prerequisites-part-1/) files. Yet, despite milliblogging long before blogging, I don't have a similar system for\n", "downloading and archiving Smugmug *metadata*. This Jupyter notebook addresses this omission \n", "and shows how you can use Python and the Smugmug API to extract gallery \n", "and image metadata and store it in version controlled local directories as `CSV` files." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## *smugpyter.py* runs in the Python 3.6 to 3.9/Jupyter/Win64 environment.\n", "\n", "A lot of the code in this notebook was derived from:\n", "\n", "1. [https://github.com/marekrei/smuploader/blob/master/smuploader/smugmug.py](https://github.com/marekrei/smuploader/blob/master/smuploader/smugmug.py)\n", "\n", "2. [https://github.com/kevinlester/smugmug_download](https://github.com/kevinlester/smugmug_download)\n", "\n", "3. [https://github.com/AndrewsOR/MugMatch](https://github.com/AndrewsOR/MugMatch) \n", "\n", "The originals did not run in the Python 3.6/Jupyter/Win64 environment and lacked some of \n", "the facilities I wanted so I adjusted, tweaked and modified the scripts. The result\n", "is incompatible with the originals so I renamed the main class *SmugPyter* to avoid confusion. \n", "Finally, being new to Python and Jupyter, I used the [2to3](http://pythonconverter.com/) tool to help make the changes." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import sys\n", "import requests\n", "sys.path.append(r'C:\\mp\\jupyter\\smugpyter\\notebooks')\n", "import smugpyter\n", "\n", "help(smugpyter)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create the *SmugPyter* configuration file.\n", "\n", "The `SmugPyter` class constuctor reads a config file. If this file is missing \n", "you cannot create instances of the SmugMug class or connect to your SmugMug account. \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the SmugPyter class constuctor reads a config file in this location.\n", "os.path.join(os.path.expanduser(\"~\"), '.smugpyter.cfg')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following prompts for your SmugMug API keys. You can apply for SmugMug keys on your SmugMug account by browsing to the API KEYS section of your account settings." ] }, { "cell_type": "raw", "metadata": {}, "source": [ "# code from https://github.com/speedenator/smuploader/blob/master/bin/smregister\n", "# modified for python 3.6/jupyter environment - modifications assisted by 2to3 tool \n", "\n", "# NOTE: this code cell has been turned into Raw NBConvert to prevent accidental execution\n", "# if you need to run this code turn this cell back into code.\n", "\n", "from rauth.service import OAuth1Service\n", "import requests\n", "import http.client\n", "import httplib2\n", "import hashlib\n", "import urllib.request, urllib.parse, urllib.error\n", "import time\n", "import sys\n", "import os\n", "import json\n", "import configparser\n", "import re\n", "import shutil\n", "\n", "# depends on previously run cells \n", "#from smuploader import SmugMug\n", "\n", "def write_config(configfile, params):\n", " config = configparser.ConfigParser()\n", " config.add_section('SMUGMUG')\n", " for key, value in params:\n", " config.set('SMUGMUG', key, value)\n", " with open(SmugMug.smugmug_config, 'w') as f:\n", " config.write(f)\n", "\n", "if __name__ == '__main__':\n", " print(\"\\n\\n\\n#######################################################\")\n", " print(\"## Welcome! \")\n", " print(\"## We are going to go through some steps to set up this SmugMug photo manager and make it connect to the API.\")\n", " print(\"## Step 0: What is your SmugMug username?\")\n", " username = input(\"Username: \")\n", " print('## Step 1: Enter your local directory, e.g. c:/SmugMirror/')\n", " localdir = input(\"Directory: \")\n", "\n", " print(\"## Step 2: Go to https://api.smugmug.com/api/developer/apply and apply for an API key.\")\n", " print(\"## This gives you unique identifiers for connecting to SmugMug.\")\n", " print(\"## When done, you can find the API keys in your SmugMug profile.\")\n", " print(\"## Account Settings -> Me -> API Keys\")\n", " print((\"## Enter them here and they will be saved to the config file (\" + SmugMug.smugmug_config + \") for later use.\"))\n", " consumer_key = input(\"Key: \")\n", " consumer_secret = input(\"Secret: \")\n", "\n", " write_config(SmugMug.smugmug_config, [(\"username\", username), (\"consumer_key\", consumer_key), \n", " (\"consumer_secret\", consumer_secret), (\"access_token\", ''), \n", " (\"access_token_secret\", '')])\n", "\n", " smugmug = SmugMug()\n", " authorize_url = smugmug.get_authorize_url()\n", " print((\"## Step 2: Visit this address in your browser to authenticate your new keys for access your SmugMug account: \\n## \" + authorize_url))\n", " print(\"## After that, enter the 6-digit key that SmugMug provided\")\n", " verifier = input(\"6-digit key: \")\n", "\n", " access_token, access_token_secret = smugmug.get_access_token(verifier)\n", "\n", " write_config(SmugMug.smugmug_config, [(\"username\", username), (\"consumer_key\", consumer_key), \n", " (\"consumer_secret\", consumer_secret), (\"access_token\", access_token),\n", " (\"access_token_secret\", access_token_secret)])\n", "\n", " print(\"## Great! All done!\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Try out the *SmugPyter* class with credentials saved in the previous cell." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "smugmug = smugpyter.SmugPyter()\n", "len(smugmug.get_album_names())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "help(smugmug)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "smugmug.get_albums()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "smugmug.get_folders()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "caught_my_eye = smugmug.get_album_id('Caught My Eye')\n", "forebearers = smugmug.get_album_id('Great and Greater Forebearers')\n", "idaho_instants = smugmug.get_album_id(\"Idaho Instants\")\n", "cell_phoning = smugmug.get_album_id(\"Cell Phoning It In\")\n", "[caught_my_eye, forebearers, idaho_instants, cell_phoning]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "smugmug.get_album_info(caught_my_eye)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "album_images = smugmug.get_album_images(forebearers)\n", "len(album_images)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "album_captions = smugmug.get_album_image_captions(album_images)\n", "album_latitude_longitude = smugmug.get_latitude_longitude_altitude(album_images)\n", "album_latitude_longitude" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "album_real_dates = smugmug.get_album_image_real_dates(album_images)\n", "album_real_dates" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try out other \"unsupported\" version 2.0 API calls. Documentation for SmugMug Version 2.0 API calls is best obtained by hacking with SmugMug's live API browser tool at:\n", "\n", "[https://api.smugmug.com/api/v2](https://api.smugmug.com/api/v2)\n", "\n", "The live API tool is far more useful if you log into your SmugMug account and point at your own images." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Walk SmugMug folders and albums and download coveted metadata.\n", "\n", "The next cell calls the main function that walks SmugMug folders and writes metadata to local directories. Metadata is saved in `TAB` delimited `CSV` manifest files. `TAB` delimited files are also called `TSV` files. The function writes one file per album. If local directories do not exist they are created. If manifest files already exist they are are overwritten. The entire SmugMug tree is walked. You might want to adjust where the walk starts if you have hundreds or thousands of albums.\n", "\n", "Manifest files follow this naming convention.\n", "\n", " manifest---.txt\n", " \n", "Here are some examples:\n", " \n", " manifest-ZambiaEclipseTrip-k65QRs-6.txt\n", " manifest-FromHazelsAlbums-FZK4j4-1k.txt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "smugmug = smugpyter.SmugPyter()\n", "\n", "smugmug.download_smugmug_mirror(func_album=smugmug.write_album_manifest)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Next on the Agenda!\n", "\n", "Remember how \"no good dead goes unpunished.\" Well, running code will be \"enhanced\" whether it's necessary or not. \n", "Now that I have a local directories that contain relevant SmugMug metadata in an easily consumed `CSV` form other notebooks will use these directories to generate what I call \"long duration documents.\" My prefered long duration sources are `Markdown` and `LaTex`. Both of these text formats will be readable for centuries if printed on high quality acid free paper and stored in numerous [\"secure and undisclosed locations.\"](https://www.pinterest.com/wisefoodstorage/bunkers-worth-pinning/)\n", "\n", "Remember, always [Analyze the Data not the Drivel](https://analyzethedatanotthedrivel.org/).\n", "\n", "John Baker, Meridian Idaho, January 31, 2022" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 2 }