{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# skorch doctor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sometimes, when working with neural nets, you'll find that it's quite hard to train them. Often in those cases, it's not that easy to say what's wrong. Do we have bad data? A bad architecture? Or did we just choose bad hyper-parameters?\n", "\n", "It's hard to rule out the last of these, because there are so many choices. When we have a small problem, we can probably run a grid search or something similar to find the best hyper-parameters, but what if that would be too expensive? Wouldn't it be better if we could inspect the net more closely and find out whether we are, for instance, dealing with vanishing or exploding gradients?\n", "\n", "Thankfully, PyTorch provides facilities like hooks to enable this. But using them correctly can be quite cumbersome. This is where `SkorchDoctor` comes into play. This class allows us to wrap the skorch net and automatically collect data about the training process, which helps us diagnose what can be improved." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Many of the tips used here are taken from Andrej Karpathy's excellent content:\n", "\n", "- https://karpathy.github.io/2019/04/25/recipe/\n", "- https://www.youtube.com/watch?v=P6sfmUTpUmc\n", "\n", "\n", "This blog post has also been very helpful when it comes to understanding possible problems in training transformer models specifically:\n", "\n", "- https://www.borealisai.com/research-blogs/tutorial-17-transformers-iii-training/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To run this notebook, you need skorch version `0.12.2` or higher. If that version has not been released yet, please install from the master branch on GitHub:\n", "\n", "`python -m pip install git+https://github.com/skorch-dev/skorch.git`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", " Run in Google Colab \n", " | \n", "View source on GitHub |