{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Computer Vision models zoo" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "hide_input": true }, "outputs": [], "source": [ "from fastai.gen_doc.nbdoc import *\n", "from fastai.vision.models.darknet import Darknet\n", "from fastai.vision.models.wrn import wrn_22, WideResNet" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The fastai library includes several pretrained models from [torchvision](https://pytorch.org/docs/stable/torchvision/models.html), namely:\n", "- resnet18, resnet34, resnet50, resnet101, resnet152\n", "- squeezenet1_0, squeezenet1_1\n", "- densenet121, densenet169, densenet201, densenet161\n", "- vgg16_bn, vgg19_bn\n", "- alexnet\n", "\n", "On top of the models offered by torchvision, fastai has implementations for the following models:\n", "- Darknet architecture, which is the base of [Yolo v3](https://pjreddie.com/media/files/papers/YOLOv3.pdf)\n", "- Unet architecture based on a pretrained model. The original unet is described [here](https://arxiv.org/abs/1505.04597), the model implementation is detailed in [`models.unet`](/vision.models.unet.html#vision.models.unet)\n", "- Wide resnets architectures, as introduced in [this article](https://arxiv.org/abs/1605.07146)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

class Darknet[source][test]

\n", "\n", "> Darknet(**`num_blocks`**:`Collection`\\[`int`\\], **`num_classes`**:`int`, **`nf`**=***`32`***) :: [`PrePostInitMeta`](/core.html#PrePostInitMeta) :: [`Module`](/torch_core.html#Module)\n", "\n", "
×

No tests found for Darknet. To contribute a test please refer to this guide and this discussion.

\n", "\n", "https://github.com/pjreddie/darknet " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(Darknet)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a Darknet with blocks of sizes given in `num_blocks`, ending with `num_classes` and using `nf` initial features. Darknet53 uses `num_blocks = [1,2,8,8,4]`. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

class WideResNet[source][test]

\n", "\n", "> WideResNet(**`num_groups`**:`int`, **`N`**:`int`, **`num_classes`**:`int`, **`k`**:`int`=***`1`***, **`drop_p`**:`float`=***`0.0`***, **`start_nf`**:`int`=***`16`***, **`n_in_channels`**:`int`=***`3`***) :: [`PrePostInitMeta`](/core.html#PrePostInitMeta) :: [`Module`](/torch_core.html#Module)\n", "\n", "
×

No tests found for WideResNet. To contribute a test please refer to this guide and this discussion.

\n", "\n", "Wide ResNet with `num_groups` and a width of `k`. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(WideResNet)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each group contains `N` blocks. `start_nf` the initial number of features. Dropout of `drop_p` is applied in between the two convolutions in each block. The expected input channel size is fixed at 3." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Structure: initial convolution -> `num_groups` x `N` blocks -> final layers of regularization and pooling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " The first block of each group joins a path containing 2 convolutions with filter size 3x3 (and various regularizations) with another path containing a single convolution with a filter size of 1x1. All other blocks in each group follow the more traditional res_block style, i.e., the input of the path with two convs is added to the output of that path.\n", " \n", " In the first group the stride is 1 for all convolutions. In all subsequent groups the stride in the first convolution of the first block is 2 and then all following convolutions have a stride of 1. Padding is always 1." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "hide_input": true }, "outputs": [ { "data": { "text/markdown": [ "

wrn_22[source][test]

\n", "\n", "> wrn_22()\n", "\n", "
×

No tests found for wrn_22. To contribute a test please refer to this guide and this discussion.

\n", "\n", "Wide ResNet with 22 layers. " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "show_doc(wrn_22)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a [`WideResNet`](/vision.models.wrn.html#WideResNet) with `num_groups=3`, `N=3`, `k=6` and `drop_p=0.`." ] } ], "metadata": { "jekyll": { "keywords": "fastai", "summary": "Overview of the models used for CV in fastai", "title": "vision.models" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.2" } }, "nbformat": 4, "nbformat_minor": 2 }