{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Single Exposure Processing\n",
    "\n",
    "This is intended to walk you through the processing pipeline on jupyterlab. It builds on the first two hands-on tutorials in the LSST [\"Getting started\" tutorial series](https://pipelines.lsst.io/getting-started/index.html#getting-started-tutorial). It is intended for anyone getting started with using the LSST Science Pipelines for data processing. \n",
    "\n",
    "The goal of this tutorial is to setup a Butler for a simulated LSST data set and to run the `processCCD.py` pipeline task to produced reduced images."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting up the data repository\n",
    "\n",
    "Sample data for this tutorial comes from the `twinkles` LSST simulation and is available in a shared directory on `jupyterlab`. We will make a copy of the input data in our current directory:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!if [ ! -d DATA ]; then cp -r /project/shared/data/Twinkles_subset/input_data_v2 DATA; fi"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Inside the data directory you'll see a directory structure that looks like this"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!ls -lh DATA/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Butler uses a mapper to find and organize data in a format specific to each camera. Here we're using `lsst.obs.lsstSim.LsstSimMapper` mapper for the Twinkles simulated data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cat DATA/_mapper"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "All of the relavent images and calibrations have already been ingested into the Butler for this data set."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Reviewing what data will be processed\n",
    "\n",
    "We'll now process individual raw LSST simulated images in the Butler `DATA` repository into calibrated exposures. We’ll use the `processCcd.py` command-line task to remove instrumental signatures with dark, bias and flat field calibration images. `processCcd.py` will also use the reference catalog to establish a preliminary WCS and photometric zeropoint solution.\n",
    "\n",
    "First we'll examine the set of exposures available in the Twinkles data set using the Butler"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we'll do a similar thing using the `processEimageTask` from the LSST pipeline. **There is a bit of ugliness here because the `processEimage.py` command line script is only python2 compatible so we need to parse the arguments through the API. This has the nasty habit of trying to exit after the args.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from lsst.obs.lsstSim.processEimage import ProcessEimageTask"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "args = 'DATA --rerun process-eimage  --id filter=r --show data'\n",
    "ProcessEimageTask.parseAndRun(args=args.split())\n",
    "\n",
    "# BUG: the command above exits early, due to a namespace problem:\n",
    "# /opt/lsst/software/stack/stack/miniconda3-4.3.21-10a4fa6/Linux64/pipe_base/15.0/python/lsst/pipe/base/argumentParser.py in parse_args(self, config, args, log, override)\n",
    "#     628 \n",
    "#     629         if namespace.show and \"run\" not in namespace.show:\n",
    "# --> 630             sys.exit(0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The important arguments here are `--id` and `--show data`.\n",
    "\n",
    "The `--id` argument allows you to select datasets to process by their data IDs. Data IDs describe individual datasets in the Butler repository. Datasets also have types, and each command-line task will only process data of certain types. In this case, `processEimage.py` will processes raw simulated e-images **(need more description of e-images)**.\n",
    "\n",
    "In the above command, the `--id filter=r` argument selects data from the r filter. Specifying `--id` without any arguments acts as a wildcard that selects all raw-type data in the repository.\n",
    "\n",
    "The `--show data` argument puts `processEimage.py` into a dry-run mode that prints a list of data IDs to standard output that would be processed according to the `--id` argument rather than actually processing the data. \n",
    "\n",
    "Notice the keys that describe each data ID, such as the visit (exposure identifier), raft (identifies a specific LSST camera raft), sensor (identifies an individual ccd on a raft) and filter, among others. With these keys you can select exactly what data you want to process."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we perform the same task directly with the Butler:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import lsst.daf.persistence as dafPersist"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "butler = dafPersist.Butler(inputs='DATA')\n",
    "butler.queryMetadata('eimage', ['visit', 'raft', 'sensor','filter'], dataId={'filter': 'r'})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Processing data\n",
    "\n",
    "Now we'll move on to actually process some of the Twinkles data. To do this, we'll remove the `--show data` argument."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "args = 'DATA --rerun process-eimage  --id filter=r --show data'\n",
    "# The command below also exits early - see the error message above.\n",
    "ProcessEimageTask.parseAndRun(args=args.split())"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "LSST_Stack (Python 3)",
   "language": "python",
   "name": "lsst_stack"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}