{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[![Open In Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/d2l-ai/d2l-pytorch-sagemaker-studio-lab/blob/main/chapter_convolutional-neural-networks/lenet.ipynb)\n", "\n", "# 6.6 Convolutional Neural Networks (LeNet)\n", "\n", "\n", "We now have all the ingredients required to assemble\n", "a fully functional CNN.\n", "In our earlier encounter with image data,\n", "we applied\n", "a softmax regression model ([Section 3.6](../chapter_linear-networks/softmax-regression-scratch.ipynb))\n", "and\n", "an MLP model ([Section 4.2](../chapter_multilayer-perceptrons/mlp-scratch.ipynb))\n", "to pictures of clothing in the Fashion-MNIST dataset.\n", "To make such data amenable to softmax regression and MLPs,\n", "we first flattened each image from a $28\\times28$ matrix\n", "into a fixed-length $784$-dimensional vector,\n", "and thereafter processed the resulting vectors with fully-connected layers.\n", "Now that we have a handle on convolutional layers,\n", "we can retain the spatial structure in our images.\n", "As an additional benefit of replacing fully-connected layers with convolutional layers,\n", "we will enjoy more parsimonious models that require far fewer parameters.\n", "\n", "In this section, we will introduce *LeNet*,\n", "among the first published CNNs\n", "to capture wide attention for its performance on computer vision tasks.\n", "The model was introduced by (and named for) Yann LeCun,\n", "then a researcher at AT&T Bell Labs,\n", "for the purpose of recognizing handwritten digits in images [LeCun et al., 1998](../chapter_references/zreferences.ipynb#LeCun.Bottou.Bengio.ea.1998).\n", "This work represented the culmination\n", "of a decade of research developing the technology.\n", "In 1989, LeCun published the first study to successfully\n", "train CNNs via backpropagation.\n", "\n", "\n", "At the time, LeNet achieved outstanding results\n", "matching 
the performance of support vector machines,\n", "then a dominant approach in supervised learning.\n", "LeNet was eventually adapted to recognize digits\n", "for processing deposits in ATMs.\n", "To this day, some ATMs still run the code\n", "that Yann and his colleague Leon Bottou wrote in the 1990s!\n", "\n", "\n", "## 6.6.1 LeNet\n", "\n", "At a high level, LeNet (LeNet-5) consists of two parts:\n", "(i) a convolutional encoder consisting of two convolutional layers; and\n", "(ii) a dense block consisting of three fully-connected layers.\n", "The architecture is summarized in [Fig. 6.6.1](#fig6.6.1).\n", "\n", "![Data flow in LeNet. The input is a handwritten digit, the output a probability over 10 possible outcomes.](../img/lenet.svg)\n", "