{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Extract multiple invoices with reports from Chrome River\n",
    "**Author**: Aaron MacGillivary (aaron@junctionapps.ca)\n",
    "\n",
    "**Provided as a demo without any warranty of any kind, use at own risk**\n",
    "\n",
    "Asked by Amy Moffitt in the forums, quick Python script provided as demo to show how one might accomplish this in Python. Amy's question was how to get all in a batch, but I'm uncertain if there is a way to get all the id's from a batch via Chrome River's api's. However, parsing an export file is done easily enough.\n",
    "\n",
    "This is a proof of concept, you'd want to automate this with naming the output file appropriately, potentially emailing the resulting pdf to someone or whatever suits your business.\n",
    "\n",
    "For this to work as written, you'll need to create a file called `cr_creds.py` saved in the same location as this notebook. It must be of the format:\n",
    "\n",
    "```python\n",
    "api_user = 'youraccountusername'\n",
    "api_pass = 'goodmixoflettersnumbers'\n",
    "api_base_url = 'servername_supplied_by_cr.chromeriver.com'\n",
    "```\n",
    "\n",
    "### Configurable options:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "export_file_name = \"CR-InvoiceReport-prod-YOURCOMPANYCODE-2021-03-01-HHMMSSNN.xml\"\n",
    "output_file_name = \"all_invoices.pdf\"\n",
    "# set include_full_pdf_report to True if you want the verbose version of the invoice image with report\n",
    "include_full_pdf_report = False "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Shouldn't need to change anything below here\n",
    "The rest of the code should require no changes, however, if your export file is not xml, or does not use the xml node `<invoiceExport>/<InvoiceHeader>/<ReportID>` you may need to alter the `get_invoices_from_export` function\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import io\n",
    "import xml.etree.ElementTree as ET\n",
    "import requests\n",
    "from PyPDF2 import PdfFileMerger, PdfFileReader\n",
    "from PyPDF2.utils import PdfReadError"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# However you manage credentials, do that. \n",
    "# for our purposes in this demo they are in a file called 'cr_creds.py' located in the directory as this notebook. That file looks like:\n",
    "\"\"\"\n",
    "api_user = 'youraccountusername'\n",
    "api_pass = 'goodmixoflettersnumbers'\n",
    "api_base_url = 'servername_supplied_by_cr.chromeriver.com'\n",
    "\"\"\"\n",
    "import cr_creds\n",
    "url = f\"https://{cr_creds.api_base_url}/receipts/doit\"\n",
    "credentials = {'un': cr_creds.api_user,\n",
    "               'pw': cr_creds.api_pass,}      "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_invoice_image(invoice_id, url, credentials, include_report=False, stream=True):\n",
    "    \"\"\" \n",
    "    Reaches out to Chrome River's invoice image api, using the getInvoiceImages\n",
    "    https://chromeriver.force.com/s/article/Image-Integration-API-for-INVOICE\n",
    "    \"\"\"\n",
    "    method_parameters = {'method': 'getInvoiceImages',\n",
    "                         'invoiceID':invoice_id,\n",
    "                        }\n",
    "    if include_report:\n",
    "        method_parameters['getPDFReport'] = 'true'  # small case on purpose\n",
    "    return requests.post(url, data={**credentials, **method_parameters}, stream=stream)    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_invoices_from_export(export_file):\n",
    "    \"\"\"\n",
    "    Returns a list of ReportID's from a Chrome River export file.\n",
    "    This may need to be tweaked if the export file has been modified from\n",
    "    the normal offering from Chrome River.\n",
    "    \"\"\"\n",
    "    tree = ET.parse(export_file)\n",
    "    root = tree.getroot()\n",
    "    return [invoice.text for invoice in root.findall(\"InvoiceHeader/ReportID\")]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_and_merge_pdfs(url, credentials, list_of_invoices, filename_to_create, include_report=False):\n",
    "    \"\"\"\n",
    "    Takes a list of invoices and merges them into a single file named filename_to_create.\n",
    "    Returns the file name created, and a list of exceptions for any that could not be read.\n",
    "    \"\"\"\n",
    "    mergedInvoices = PdfFileMerger()\n",
    "    exceptions = list()\n",
    "    for invoice in list_of_invoices:\n",
    "        invoice_pdf = get_invoice_image(invoice, url, credentials, include_report, stream=False)\n",
    "        \n",
    "        if invoice_pdf.status_code == 200:\n",
    "            try:\n",
    "                mergedInvoices.append(io.BytesIO(invoice_pdf.content))\n",
    "            except PdfReadError:\n",
    "                exceptions.append(invoice)\n",
    "                \n",
    "    if list_of_invoices:\n",
    "        mergedInvoices.write(filename_to_create)\n",
    "        return filename_to_create, exceptions\n",
    "    else:\n",
    "        return None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# if you want to specify invoices manually, use this instead       \n",
    "# invoices = ['050001111111',\n",
    "#             '050002222222'  # adding as many as needed in this same way\n",
    "#            ]    \n",
    "\n",
    "# otherwise # get the list of invoices from a Chrome River export:\n",
    "invoices = get_invoices_from_export(export_file_name)\n",
    "\n",
    "# then merge them together\n",
    "get_and_merge_pdfs(url, credentials, invoices, output_file_name, include_report=include_full_pdf_report)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# test a single invoice\n",
    "DEBUGGING = False\n",
    "if DEBUGGING:\n",
    "    import shutil\n",
    "    invoice_id = '050000000001'\n",
    "    r = get_invoice_image(invoice_id, url, credentials, include_report=False, stream=True)\n",
    "\n",
    "    if r.status_code == 200:    \n",
    "        with open(f'invoice_sample2_{invoice_id}.pdf', 'wb') as invoice:\n",
    "            r.raw.decode_content = True\n",
    "            shutil.copyfileobj(r.raw, invoice)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}