{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# \"Deep Learning cheat sheet for Python\"\n",
    "- toc:true\n",
    "- categories: [\"deep-learning\"]\n",
    "- image: images/copied_from_nb/images/dl-cheat-sheet.jpg\n",
    "- comments: true"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Having worked on Deep Learning for almost a year, I kept noting useful resources and snippets I used frequently. I've compiled all of them in one place below.\n",
    "**Feel free to share some of your hacks/resources/snippets below. Leave a comment below, pull requests are welcome too!**  \n",
    "\n",
    "![](images/dl-cheat-sheet.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Jupyter clear GPU Memory\n",
    "\n",
    "There are times I just cant afford to restart the kernel lol. The snipped below helps me in those cases"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# collapse-hide\n",
    "import gc\n",
    "def dump_tensors(gpu_only=True):\n",
    "        torch.cuda.empty_cache()\n",
    "        total_size = 0\n",
    "        for obj in gc.get_objects():\n",
    "            # print(obj)\n",
    "            try:\n",
    "                if torch.is_tensor(obj):\n",
    "                    if obj.is_cuda:\n",
    "                        del obj\n",
    "                        gc.collect()\n",
    "                elif hasattr(obj, \"data\") and torch.is_tensor(obj.data):\n",
    "                    if not gpu_only or obj.is_cuda:\n",
    "                        del obj\n",
    "                        gc.collect()\n",
    "            except Exception as e:\n",
    "                pass\n",
    "dump_tensors()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Free Quality Courses\n",
    "\n",
    "- [Machine Learning](https://www.youtube.com/playlist?list=PLl8OlHZGYOQ7bkVbuRthEsaLr7bONzbXS)\n",
    "    - Taught by a professor in Cornell, love his style of teaching\n",
    "- [Deep Learning Fastai](https://course.fast.ai/)\n",
    "- [Deep Learning for Visual Recognition](https://www.youtube.com/playlist?list=PL3FW7Lu3i5JvHM8ljYj-zLfQRF3EO8sYv)\n",
    "- [Variational Autoencoder](https://www.youtube.com/playlist?list=PLdxQ7SoCLQANizknbIiHzL_hYjEaI-wUe)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Free GPU resources\n",
    "\n",
    "Resources below are mostly student focussed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- [a good list of free resources](https://course.fast.ai/#ready-to-run-one-click-jupyter)\n",
    "- [free Azure,AWS and MongoDB creds (Student Developer Pack)](https://education.github.com/pack/offers)\n",
    "- [google cloud resources](https://google.dev/edu)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Computer Vision models\n",
    "\n",
    "- [PyTorch Image Models](https://github.com/rwightman/pytorch-image-models)\n",
    "- [fastai](https://github.com/fastai/fastai)\n",
    "- [(Generic) EfficientNets for PyTorch](https://github.com/rwightman/gen-efficientnet-pytorch)\n",
    "- [EfficientNet PyTorch](https://github.com/lukemelas/EfficientNet-PyTorch)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## GANs\n",
    "\n",
    "- [Improved GAN (Semi-supervised GAN)](https://github.com/Sleepychord/ImprovedGAN-pytorch)\n",
    "- [Pytorch GAN](https://github.com/eriklindernoren/PyTorch-GAN)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Training tools\n",
    "\n",
    "- [tqdm: A progressbar](https://github.com/tqdm/tqdm)\n",
    "- [livelossplot](https://github.com/stared/livelossplot)\n",
    "    - best way to analyse plots while training\n",
    "- [The most lightweight experiment management tool that fits any workflow](https://neptune.ai/)\n",
    "- [fastpages](https://github.com/fastai/fastpages)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pre/Post processing\n",
    "\n",
    "- [Train-val split pytorch](https://gist.github.com/kevinzakka/d33bf8d6c7f06a9d8c76d97a7879f5cb)\n",
    "- [Augmix](https://github.com/google-research/augmix)\n",
    "- [Cutmix](https://github.com/clovaai/CutMix-PyTorch)\n",
    "- [Quick test time augmentation](https://github.com/qubvel/ttach)\n",
    "- [librosa: Audio extraction](https://github.com/librosa/librosa)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## K Fold Cross Validation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# collapse-hide\n",
    "num_splits = 5\n",
    "all_probs = torch.zeros(610, 3, dtype=torch.float32)\n",
    "# for i in range(2,2+1):\n",
    "train_dataset, valid_dataset = get_train_valid_dataset(PROJECT_PATH + \"data/train/\",batch_size=batch_size,augment=True,\n",
    "                                                    random_seed=42,valid_size=0.2,shuffle=True,show_sample=False,\n",
    "                                                    num_workers=4,pin_memory=True,split_no=1)\n",
    "skf = StratifiedKFold(n_splits=5,shuffle=True,random_state=42)\n",
    "# https://discuss.pytorch.org/t/how-can-i-use-sklearn-kfold-with-imagefolder/36577\n",
    "# https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.StratifiedKFold.html#sklearn.model_selection.StratifiedKFold\n",
    "\n",
    "pin_memory=True\n",
    "for fold_num, (train_index, test_index) in enumerate(tq(skf.split(train_dataset, train_dataset.targets))):\n",
    "  # valid_subset = torch.utils.data.Subset(valid_dataset,test_index)\n",
    "  train_sampler = ImbalancedDatasetSampler(train_dataset,train_index)\n",
    "  valid_sampler = ImbalancedDatasetSampler(valid_dataset,test_index)\n",
    "\n",
    "  train_loader = torch.utils.data.DataLoader(\n",
    "      train_dataset, batch_size=batch_size, sampler=train_sampler,\n",
    "      num_workers=num_workers, pin_memory=pin_memory\n",
    "  )\n",
    "  valid_loader = torch.utils.data.DataLoader(\n",
    "      valid_dataset, batch_size=batch_size, sampler=valid_sampler,\n",
    "      num_workers=num_workers, pin_memory=pin_memory\n",
    "  )\n",
    "  \n",
    "  model_name = 'densenet201'\n",
    "  optim_name = 'AdamW'\n",
    "  lr = 8e-5\n",
    "  PARAMS = {'learning_rate' : lr,\n",
    "            'n_epochs' : 40,\n",
    "            'optimizer' : optim_name,\n",
    "            'model' : model_name,\n",
    "            'fold' : fold_num,\n",
    "            'save_name': model_name+'_TTA_'+optim_name+'_fold_'+str(fold_num)+\"_img_\"+str(img_size)+'_lr-'+str(lr)\n",
    "            }\n",
    "  neptune.create_experiment(name='pytorch-'+model_name+'-Adam', params=PARAMS)\n",
    "  print(\"Started train for fold\",fold_num)\n",
    "  model = trainCNN(PARAMS['n_epochs'],train_loader,valid_loader,model_name,True,\n",
    "                  PARAMS['save_name'],PARAMS['learning_rate'],False)\n",
    "  preds_path,preds_prob = predict(test_loader, model,True)\n",
    "  \n",
    "  make_submission(preds_prob,preds_path,PARAMS['save_name'])\n",
    "\n",
    "  all_probs+= preds_prob\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Cyclic loaders\n",
    "\n",
    "The standard loader from itertools takes a lot of RAM. The piece of code below reduced my RAM usage by 5 times.  \n",
    "Thanks to rmrao for the [solution](https://github.com/pytorch/pytorch/issues/23900#issuecomment-518858050)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# collapse-hide\n",
    "def cycle(iterable):\n",
    "    iterator = iter(iterable)\n",
    "    while True:\n",
    "        try:\n",
    "            yield next(iterator)\n",
    "        except StopIteration:\n",
    "            iterator = iter(iterable)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Reproducibility\n",
    "\n",
    "Quick function to set all the seeds of Pytorch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# collapse-hide\n",
    "# seeding function for reproducibility\n",
    "def seed_everything(seed):\n",
    "    random.seed(seed)\n",
    "    os.environ[\"PYTHONHASHSEED\"] = str(seed)\n",
    "    np.random.seed(seed)\n",
    "    torch.manual_seed(seed)\n",
    "    torch.cuda.manual_seed(seed)\n",
    "    torch.backends.cudnn.deterministic = True"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}