{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# JSON\n", "As mentioned in the slides, JSON is very simliar to python's dictionaries. To demonstrate that, we'll go over a couple of examples." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "json_obj = {'Name': 'Interstellar', 'Genres': ['Science Fiction', 'Drama']}\n", "\n", "# What the raw python looks like\n", "print(json_obj)\n", "print(type(json_obj))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## JSON String" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "str_obj = json.dumps(json_obj)\n", "print(str_obj)\n", "print(type(str_obj))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The only real change you may notice is that the single qoutes (**'**) were replaced with double qoutes **\"**. This is because all string objects must be double qouted in proper JSON.\n", "\n", "Now that it is a string, we can't index into it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "json_obj['Name']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "str_obj['Name']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Loading back in with `json.loads(`*`string`*`)` allows us to resume interacting with the python object:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_obj = json.loads(str_obj)\n", "new_obj['Name']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Writing Out JSON\n", "Rather than dumping to a string, we can also dump to a file using `json.dump(`*`file_pointer`*`)`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open('test_json.json', 'w') as json_file:\n", " json.dump(json_obj, json_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Likewise we can read that information back in using `json.load(`*`file_pointer`*`)`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open('test_json.json', 'r') as json_file:\n", " json_data = json.load(json_file)\n", " \n", "print(json_data)\n", "print()\n", "print(json_data['Name'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Incompatiable Data Types\n", "Sometimes when working with json we will need to reinterpret certain python types to ensure that they work with JSON's limitations.\n", "\n", "For example **datetime** objects don't play nicely with JSON." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "from datetime import datetime\n", "\n", "json_obj = {'Name': 'Interstellar', 'Genres': ['Science Fiction', 'Drama'], 'Release Date': datetime.now()}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(json_obj)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(json.dumps(json_obj))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Serialization\n", "\n", "In this instance we need to write some code to handle these conversions or serialize the data.\n", " - **Serialization**: The process of translating a data structure or object state into a format that can be stored or transmitted and reconstructed later.\n", "\n", "With python's JSON module we can leverage serializers in the `json.dumps()` function to serialize our data into a string format.\n", "```python\n", "json.dumps(python_obj, default=json_serilaizer)\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def json_serializer(obj):\n", " if isinstance(obj, (datetime)):\n", " return obj.strftime(\"%Y-%m-%d %H:%M:%S\")\n", " \n", " \n", "print(json.dumps(json_obj, default=json_serializer, indent=4))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Other Serialization\n", "\n", "Pickle and Dill are two python serialization packages we commonly use to save out python objects for later use. This might be a dataframe, machine learning model, or some other object.\n", "\n", "For deep learning we typically save our models out in HDF5 (Hierarchial Data Format).\n", "\n", "### Dill/Pickle\n", "Similar to the json library, dill/pickle support the `.dump()` and `.load()` methods" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import dill as pkl\n", "\n", "with open('test_file.pkl', 'wb') as pkl_file:\n", " pkl.dump(json_obj, pkl_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can show this worked by reading back in the serialized file:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open('test_file.pkl', 'rb') as pkl_file:\n", " data_obj = pkl.load(pkl_file)\n", " print(data_obj)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }