{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from glob import glob\n",
    "import os \n",
    "\n",
    "import ipywidgets as widgets\n",
    "wstyle = {'description_width': 'initial'}\n",
    "\n",
    "\n",
    "import pandas as pd\n",
    "from datetime import datetime, timedelta\n",
    "\n",
    "import requests\n",
    "import s3fs\n",
    "import boto3\n",
    "\n",
    "from rasterio.plot import show\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "from random import sample\n",
    "collectFootprint = None\n",
    "collectDate = None\n",
    "\n",
    "\n",
    "\n",
    "from retrying import retry\n",
    "os.environ['CURL_CA_BUNDLE']='/etc/ssl/certs/ca-certificates.crt'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Step 2: Planet Labs Imagery Ordering 🛰 🖼\n",
    "\n",
    "🚨**WARNING**🚨: The operations in this notebook interact with Planet Labs via a programmatic interface and **therefore count against your Planet Quota**! Be sure you know what this means for your individual Planet Labs access. \n",
    "\n",
    "This notebook describes ordering imagery from Planet Labs overlapping an area of interest. We'll make use of the [`porder`](https://github.com/samapriya/porder) command-line tool. \n",
    "\n",
    "Of course, it is necessary for this step that you have an API key with Planet Labs. To begin this process, check out the [Planet Education and Research Program](https://www.planet.com/markets/education-and-research/)\n",
    "\n",
    "⚠️**Note**⚠️: In order to run these commands in this notebook, `porder` requires a `.planet.json` file in your home directory with the following format: \n",
    "\n",
    "```\n",
    "{\n",
    "    \"key\" : \"<planet API key>\"\n",
    "}\n",
    "```\n",
    "\n",
    "If you have the [Planet CLI](https://planetlabs.github.io/planet-client-python/cli/index.html) installed, you can run `planet init` to place this file in the correct place for `porder`.\n",
    "\n",
    "## Note: Multi-Use Notebook\n",
    "The primary utility of this notebook is to download Planet Labs imagery to accompany the ASO snow mask for training a fresh ML model. However, you can also use this notebook to download Planet imagery for **any** area of interest, for model evaluation or dataset production. Simply specify a different GeoJSON file than the default ASO footprint, as outlined below. \n",
    "\n",
    "## ASO Collect Selection \n",
    "*If you're just using this notebook to download Planet imagery to run through the model, you can skip the next few steps*.\n",
    "\n",
    "In the space below, provide a directory to search for ASO collect `.tif` files. This is the same place where the **Step 1** notebook stored the raw downloaded ASO Collects. We'll use this information to find the GeoJSON footprint of the given collect. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "imgDir = widgets.Text(description=\"ASO Location\", style = wstyle)\n",
    "imgDir"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now select an ASO collect: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if not imgDir.value:\n",
    "    imgDir.value = \"/tmp/work\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "collect = widgets.Select(\n",
    "    options = glob(imgDir.value + \"/*.tif\")\n",
    ")\n",
    "collect"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if not collect.value:\n",
    "    collect.value = ' ASO_3M_SD_USCOGE_20180524.tif'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll check to be sure that this selected image has a corresponding GeoJSON file. \n",
    "\n",
    "⚠️ If the below step fails, verify in **Step 1** that the \"*Preprocess for Tiling*\" step completed without errors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "collectFootprint = os.path.splitext(collect.value)[0]+'.geojson'\n",
    "collectDateStr = os.path.basename(collect.value).split('_')[-1].split(\".\")[0]\n",
    "collectDate = datetime.strptime(collectDateStr, \"%Y%m%d\")\n",
    "assert os.path.exists(collectFootprint), \"corresponding geojson file not found!\""
   ]
  },
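  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check (a sketch, assuming the footprint produced in **Step 1** is a standard GeoJSON `FeatureCollection`), we can peek at the footprint's geometry type: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "# Assumes a FeatureCollection with at least one feature (Step 1's output).\n",
    "with open(collectFootprint) as f:\n",
    "    fp = json.load(f)\n",
    "print(fp['features'][0]['geometry']['type'])"
   ]
  },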
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## GeoJSON Specification\n",
    "*If you're using this notebook independently of model training **here** is where you'll put your GeoJSON for your area of interest*. \n",
    "\n",
    "Here we'll specify the Area of Interest (AOI) in GeoJSON format that Planet will use to search for imagery on our behalf. If you've already done the above steps, you'll see the below text box is already filled out. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "AOIPath = widgets.Text(description=\"AOI Path\", value = collectFootprint,  style = wstyle)\n",
    "AOIPath"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Date Range Search \n",
    "\n",
    "Here you can choose some parameters around the temporal search window for your Planet imagery. All you need to do is select a **central date** and a **window size (days)**, and we'll compute the start and end dates for the search query. \n",
    "\n",
    "*If you're using an ASO collect, the central date will be filled for you*. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "centralDate = widgets.DatePicker(\n",
    "    description='Central Date',\n",
    "    disabled=False, \n",
    "    value = collectDate\n",
    ")\n",
    "dateWindow = widgets.IntText(description = 'Window Size', value = 1)\n",
    "widgets.VBox([centralDate, dateWindow],  style = wstyle)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "searchBuffer = timedelta(days = date)\n",
    "outputFormat = \"%Y-%m-%d\"\n",
    "searchStartDate = datetime.strftime(centralDate.value - searchBuffer, outputFormat) \n",
    "searchEndDate = datetime.strftime(centralDate.value + searchBuffer, outputFormat)\n",
    "print(searchStartDate, centralDate.value, searchEndDate, sep='\\n')"
   ]
  },
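  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For example, with the default window size of 1 day and a central date of 2018-05-24, the search spans 2018-05-23 through 2018-05-25."
   ]
  },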
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<span><h2>Imagery Search 🔎</h2></span>\n",
    "\n",
    "We can now use `porder`'s `idlist` function to search for Planet Asset Ids which match our specifications, which are that the images: \n",
    "\n",
    "* Overlaps with an ASO Collect in space\n",
    "* Was taken within 2-3 days of collection \n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash -s \"$AOIPath.value\" \"$searchStartDate\" \"$searchEndDate\" \"$imgDir.value\"\n",
    "\n",
    "porder idlist --input $1 --start $2 --end $3 --asset analytic_sr --item PSScene4Band --outfile $4/planet_ids.csv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There's now a file in the directory specified above called `planet_ids.csv` containing IDs of images found to satisfy these constraints. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "idListFile = os.path.join(imgDir.value, 'planet_ids.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pd.read_csv(idListFile, header=None)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Imagery Ordering 👆\n",
    "\n",
    "Now that these image IDs have been identified, we can submit these IDs to the Planet Orders API to be clipped to our specified area of interest (the ASO collect footprint) and delivered. For the purposes of this analysis we use the direct AWS S3 delivery option, this is optional. (If you don't use that option, you'll be given a link to check when your clipping operation is finished, at which point you can download the new assets. The `email` option of `porder` will send an e-mail when this is finished, as well). \n",
    "\n",
    "**🚨FINAL WARNING🚨**: *any orders submitted this way **will count against your quota**.* Check your Planet quotas via `porder quota`.\n",
    "\n",
    "In order to use the AWS functionality, you must create a credentials file in the following way:\n",
    "```\n",
    "amazon_s3:\n",
    "    bucket: \"<bucket name, e.g. planet-orders>\" \n",
    "    aws_region: \"<region name, e.g. us-west-2>\"\n",
    "    aws_access_key_id: \"<AWS Access key>\"\n",
    "    aws_secret_access_key: \"<AWS Secret>\"\n",
    "    path_prefix: \"<bucket prefix>\"\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "AWSCredFile = \"/home/ubuntu/aws-cred.yml\""
   ]
  },
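  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before ordering, it's worth confirming that this credentials file actually parses. A minimal sketch, assuming the `pyyaml` package is installed: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import yaml\n",
    "\n",
    "# Confirm the delivery credentials file parses and has the expected section.\n",
    "with open(AWSCredFile) as f:\n",
    "    cred = yaml.safe_load(f)\n",
    "assert 'amazon_s3' in cred, 'missing amazon_s3 section'"
   ]
  },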
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To order the imagery, we use the `porder order` functionality. This requires an order name, which we generate from the ASO collect filename and today's date.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "orderName = \"ORDER-{}-{}\".format(os.path.basename(collect.value).split('.')[0], datetime.strftime(datetime.now(), \"%Y%m%d-%H%M\"))\n",
    "orderName"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash -s \"$orderName\" \"$AOIPath.value\" \"$idListFile\" \"$AWSCredFile\"\n",
    "\n",
    "porder order --name $1 --item PSScene4Band --bundle analytic_sr --boundary $2 --idlist $3 --aws $4 --op clip email aws "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This URL above is the order reference, and can be queried (if the `email` operation was chosen, an e-mail will be sent when the operation is finished as well). \n",
    "\n",
    "To query the URL endpoint, provide your Planet username and password below, and run the adjacent cells. (This private information isn't available anywhere other than in your computer's temporary memory and will be deleted when Python stops running). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "username = widgets.Text(description = \"username\")\n",
    "password = widgets.Password(description = \"password\")\n",
    "orderUrl = widgets.Text(description = \"Order URL\")\n",
    "widgets.VBox([username, password, orderUrl], box_style = 'info')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "@retry(stop_max_delay = 20 * 60 * 60 , wait_fixed = 10000)\n",
    "def checkOrderStatus():\n",
    "    with requests.Session() as session:\n",
    "        session.auth = (username.value, password.value)    \n",
    "        r = session.get(orderUrl.value).json()\n",
    "\n",
    "    return(r)\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When the above function returns `results`, we've got images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataFiles = checkOrderStatus()['_links']['results']\n",
    "dataFiles"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "From these we extract the image files, which we'll turn into tiles. \n",
    "\n",
    "## Image Tiling to Cloud Storage\n",
    "\n",
    "At this point, since we specified the AWS delivery option, the raw Planet Assets are stored in S3. Next, we'll tile these images to that same bucket in S3, which we'll have to specify again below. \n",
    "\n",
    "For each image requested from the Planet Orders API, three objects are returned: \n",
    "\n",
    "1. A Metadata file (`.json`)\n",
    "2. A Usable Data Mask (`*_DN_udm_clip.tif`)\n",
    "3. The clipped Image (`*_SR_clip.tif`)\n",
    "\n",
    "We'll get the paths for the clipped images (`*_SR_clip.tif`) and send them to our `preprocess` module for tiling. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "images = [_i['name'] for _i in dataFiles if _i['name'].endswith('_SR_clip.tif')]\n",
    "images"
   ]
  },
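  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Optionally, we can confirm that these objects actually landed in the delivery bucket. A sketch, assuming the `esip` AWS profile and the `planet-snowcover-imagery` bucket used in the tiling step below: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# head_object raises a ClientError if an object is missing.\n",
    "s3 = boto3.Session(profile_name = 'esip').client('s3')\n",
    "for image in images:\n",
    "    s3.head_object(Bucket = 'planet-snowcover-imagery', Key = image)\n",
    "    print('found:', image)"
   ]
  },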
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, specify the AWS S3 bucket you'd like the image tiles stored in (usually, its the same as the one the raw images are currently stored in). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "s3Bucket = widgets.Text(description=\"S3 Bucket\")\n",
    "s3Bucket"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "os.chdir(\"/home/ubuntu/planet-snowcover/\")\n",
    "for image in images:\n",
    "    imageLocal = os.path.basename(image)\n",
    "    ! aws s3 cp --profile esip {'s3://planet-snowcover-imagery/'+image} /tmp/{imageLocal}\n",
    "    ! /home/ubuntu/anaconda3/envs/pytorch_p36/bin/python -m preprocess tile --zoom 15 --indexes 1 2 3 4 --quant 10000 --aws_profile esip --skip-blanks s3://{s3Bucket.value} /tmp/{imageLocal} \n",
    "    ! rm /tmp/{imageLocal}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's see how that panned out. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Check some tiles\n",
    "import rasterio as rio\n",
    "fs = s3fs.S3FileSystem(session= boto3.Session(profile_name = 'esip'))\n",
    "\n",
    "for image in images:\n",
    "    imageId = os.path.basename(image).split('.')[0]\n",
    "    tiles = fs.walk('planet-snowcover-imagery/{}'.format(imageId))\n",
    "    print(len(tiles))\n",
    "    \n",
    "    sampTiles = sample(tiles, 10)\n",
    "    fig = plt.figure(figsize = (10, 10), dpi = 100)\n",
    "    grid = plt.GridSpec(5, 3)\n",
    "    plt.suptitle(imageId)\n",
    "\n",
    "    with rio.Env(profile_name='esip'):\n",
    "        for i, t in enumerate(sampTiles):\n",
    "            tile = '/'.join(t.split('/')[-3:])\n",
    "            ax = plt.subplot(grid[i])\n",
    "            ax.set_title(tile)\n",
    "            ax.axis('equal')\n",
    "            ax.get_xaxis().set_visible(False)\n",
    "            ax.get_yaxis().set_visible(False)\n",
    "            show(rio.open('s3://' + t).read(4), cmap='binary_r', ax = ax)\n",
    "        \n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## In Summary\n",
    "\n",
    "We've started with an ASO collect, identified relevant Planet assets, ordered those clipped assets from Planet, and processed them into tiles that live on Amazon S3. Below are the S3 folders that contain these tiled assets for reference during model training. We'll put them in a file. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "assets =[\"s3://planet-snowcover-imagery/{}/\".format(os.path.basename(image).split('.')[0]) for image in images]\n",
    "assets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pd.Series(assets).to_csv('/tmp/work/assets.csv', index = False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!cat /tmp/work/assets.csv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a reminder, these images are overlapping with the following ASO collect:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "collect.value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Running the below cell will determine the corresponding ASO tile locations, if any. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "collectTiles = 'planet-snowcover-snow/{}_binary'.format(os.path.splitext(os.path.basename(collect.value))[0])\n",
    "print(\"Tiles located at \\\"{}\\\"\".format(collectTiles)) if fs.exists(collectTiles) else \"tiles not present on S3\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This ASO tile location, paired with the above imagery tile locations, is sufficient information to train the snow identification model. "
   ]
  },
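  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For convenience, a small sketch records both locations together (the `training_locations.json` filename is just an example): "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "# Pair the ASO mask tiles with the imagery tile folders for training.\n",
    "with open('/tmp/work/training_locations.json', 'w') as f:\n",
    "    json.dump({'mask': collectTiles, 'imagery': assets}, f, indent = 2)"
   ]
  },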
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    " ---\n",
    "by Tony Cannistra.\n",
    "\n",
    "© University of Washington, 2019.\n",
    "\n",
    "Support from the National Science Foundation, NASA THP, Earth Science Information partners, and the UW eScience Institute.</small>\n",
    "\n",
    "<div style=\"text-align:center\" markdown=\"1\">\n",
    "<img src=\"https://upload.wikimedia.org/wikipedia/commons/thumb/3/39/Planet_logo_New.png/320px-Planet_logo_New.png\" width=100/>\n",
    "</div>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Environment (conda_pytorch_p36)",
   "language": "python",
   "name": "conda_pytorch_p36"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}