{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 目录\n", "\n", "> 因为 MXNet 的某些概念及操作在 PyTorch 无法实现,在相应章节处做了 “ ~~无效章节~~ ” 标注" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### [前言](chapter_preface/preface.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. [深度学习简介](chapter_introduction/deep-learning-intro.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2. 预备知识\n", "在动手学习之前,我们需要获取本书的代码,并安装运行本书的代码所需要的软件。作为动手学深度学习的基础,我们还需要了解如何对内存中的数据进行操作,以及对函数求梯度的方法。最后,我们应养成主动查阅文档来学习代码的良好习惯。\n", "\n", "#### 2.1. [获取和运行本书的代码](chapter_prerequisite/install.ipynb)\n", "#### 2.2. [数据操作](chapter_prerequisite/tensor.ipynb)\n", "#### 2.3. [自动求梯度](chapter_prerequisite/autograd.ipynb)\n", "#### 2.4. [查阅文档](chapter_prerequisite/lookup-api.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3. 深度学习基础\n", "从本章开始,我们将探索深度学习的奥秘。作为机器学习的一类,深度学习通常基于神经网络模型逐级表示越来越抽象的概念或模式。我们先从线性回归和 softmax 回归这两种单层神经网络入手,简要介绍机器学习中的基本概念。然后,我们由单层神经网络延伸到多层神经网络,并通过多层感知机引入深度学习模型。在观察和了解了模型的过拟合现象后,我们将介绍深度学习中应对过拟合的常用方法:权重衰减和丢弃法。接着,为了进一步理解深度学习模型训练的本质,我们将详细解释正向传播和反向传播。掌握这两个概念后,我们能更好地认识深度学习中的数值稳定性和初始化的一些问题。最后,我们通过一个深度学习应用案例对本章内容学以致用。\n", "\n", "在本章的前几节,我们先介绍单层神经网络:线性回归和softmax回归。\n", "\n", "#### 3.1. [线性回归](chapter_deep-learning-basics/linear-regression.ipynb)\n", "#### 3.2. [线性回归的从零开始实现](chapter_deep-learning-basics/linear-regression-scratch.ipynb)\n", "#### 3.3. [线性回归的简洁实现](chapter_deep-learning-basics/linear-regression-nn.ipynb)\n", "#### 3.4. [softmax回归](chapter_deep-learning-basics/softmax-regression.ipynb)\n", "#### 3.5. [图像分类数据集(Fashion-MNIST)](chapter_deep-learning-basics/fashion-mnist.ipynb)\n", "#### 3.6. [softmax回归的从零开始实现](chapter_deep-learning-basics/softmax-regression-scratch.ipynb)\n", "#### 3.7. [softmax回归的简洁实现](chapter_deep-learning-basics/softmax-regression-nn.ipynb)\n", "#### 3.8. [多层感知机](chapter_deep-learning-basics/mlp.ipynb)\n", "#### 3.9. [多层感知机的从零开始实现](chapter_deep-learning-basics/mlp-scratch.ipynb)\n", "#### 3.10. [多层感知机的简洁实现](chapter_deep-learning-basics/mlp-nn.ipynb)\n", "#### 3.11. [模型选择、欠拟合和过拟合](chapter_deep-learning-basics/underfit-overfit.ipynb)\n", "#### 3.12. [权重衰减](chapter_deep-learning-basics/weight-decay.ipynb)\n", "#### 3.13. [丢弃法](chapter_deep-learning-basics/dropout.ipynb)\n", "#### 3.14. [正向传播、反向传播和计算图](chapter_deep-learning-basics/backprop.ipynb)\n", "#### 3.15. [数值稳定性和模型初始化](chapter_deep-learning-basics/numerical-stability-and-init.ipynb)\n", "#### 3.16. [实战Kaggle比赛:房价预测](chapter_deep-learning-basics/kaggle-house-price.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4. 深度学习计算\n", "上一章介绍了包括多层感知机在内的简单深度学习模型的原理和实现。本章我们将简要概括深度学习计算的各个重要组成部分,如模型构造、参数的访问和初始化等,自定义层,读取、存储和使用GPU。通过本章的学习,我们将能够深入了解模型实现和计算的各个细节,并为在之后章节实现更复杂模型打下坚实的基础。\n", "\n", "#### 4.1. [模型构造](chapter_deep-learning-computation/model-construction.ipynb)\n", "#### 4.2. [模型参数的访问、初始化和共享](chapter_deep-learning-computation/parameters.ipynb)\n", "#### ~~4.3. 模型参数的延后初始化~~\n", "#### 4.4. [自定义层](chapter_deep-learning-computation/custom-layer.ipynb)\n", "#### 4.5. [读取和存储](chapter_deep-learning-computation/read-write.ipynb)\n", "#### 4.6. [GPU计算](chapter_deep-learning-computation/use-gpu.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 5. 卷积神经网络\n", "本章将介绍卷积神经网络。它是近年来深度学习能在计算机视觉领域取得突破性成果的基石。它也逐渐在被其他诸如自然语言处理、推荐系统和语音识别等领域广泛使用。我们将先描述卷积神经网络中卷积层和池化层的工作原理,并解释填充、步幅、输入通道和输出通道的含义。在掌握了这些基础知识以后,我们将探究数个具有代表性的深度卷积神经网络的设计思路。这些模型包括最早提出的AlexNet,以及后来的使用重复元素的网络(VGG)、网络中的网络(NiN)、含并行连结的网络(GoogLeNet)、残差网络(ResNet)和稠密连接网络(DenseNet)。它们中有不少在过去几年的ImageNet比赛(一个著名的计算机视觉竞赛)中大放异彩。虽然深度模型看上去只是具有很多层的神经网络,然而获得有效的深度模型并不容易。有幸的是,本章阐述的批量归一化和残差网络为训练和设计深度模型提供了两类重要思路。\n", "\n", "#### 5.1. [二维卷积层](chapter_convolutional-neural-networks/conv-layer.ipynb)\n", "#### 5.2. [填充和步幅](chapter_convolutional-neural-networks/padding-and-strides.ipynb)\n", "#### 5.3. [多输入通道和多输出通道](chapter_convolutional-neural-networks/channels.ipynb)\n", "#### 5.4. [池化层](chapter_convolutional-neural-networks/pooling.ipynb)\n", "#### 5.5. [卷积神经网络(LeNet)](chapter_convolutional-neural-networks/lenet.ipynb)\n", "#### 5.6. [深度卷积神经网络(AlexNet)](chapter_convolutional-neural-networks/alexnet.ipynb)\n", "#### 5.7. [使用重复元素的网络(VGG)](chapter_convolutional-neural-networks/vgg.ipynb)\n", "#### 5.8. [网络中的网络(NiN)](chapter_convolutional-neural-networks/nin.ipynb)\n", "#### 5.9. [含并行连结的网络(GoogLeNet)](chapter_convolutional-neural-networks/googlenet.ipynb)\n", "#### 5.10. [批量归一化](chapter_convolutional-neural-networks/batch-norm.ipynb)\n", "#### 5.11. [残差网络(ResNet)](chapter_convolutional-neural-networks/resnet.ipynb)\n", "#### 5.12. [稠密连接网络(DenseNet)](chapter_convolutional-neural-networks/densenet.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6. 循环神经网络\n", "与之前介绍的多层感知机和能有效处理空间信息的卷积神经网络不同,循环神经网络是为更好地处理时序信息而设计的。它引入状态变量来存储过去的信息,并用其与当前的输入共同决定当前的输出。\n", "\n", "循环神经网络常用于处理序列数据,如一段文字或声音、购物或观影的顺序,甚至是图像中的一行或一列像素。因此,循环神经网络有着极为广泛的实际应用,如语言模型、文本分类、机器翻译、语音识别、图像分析、手写识别和推荐系统。\n", "\n", "因为本章中的应用是基于语言模型的,所以我们将先介绍语言模型的基本概念,并由此激发循环神经网络的设计灵感。接着,我们将描述循环神经网络中的梯度计算方法,从而探究循环神经网络训练可能存在的问题。对于其中的部分问题,我们可以使用本章稍后介绍的含门控的循环神经网络来解决。最后,我们将拓展循环神经网络的架构。\n", "\n", "#### 6.1. [语言模型](chapter_recurrent-neural-networks/lang-model.ipynb)\n", "#### 6.2. [循环神经网络](chapter_recurrent-neural-networks/rnn.ipynb)\n", "#### 6.3. [语言模型数据集(周杰伦专辑歌词)](chapter_recurrent-neural-networks/lang-model-dataset.ipynb)\n", "#### 6.4. [循环神经网络的从零开始实现](chapter_recurrent-neural-networks/rnn-scratch.ipynb)\n", "#### 6.5. [循环神经网络的简洁实现](chapter_recurrent-neural-networks/rnn-nn.ipynb)\n", "#### 6.6. [通过时间反向传播](chapter_recurrent-neural-networks/bptt.ipynb)\n", "#### 6.7. [门控循环单元(GRU)](chapter_recurrent-neural-networks/gru.ipynb)\n", "#### 6.8. [长短期记忆(LSTM)](chapter_recurrent-neural-networks/lstm.ipynb)\n", "#### 6.9. [深度循环神经网络](chapter_recurrent-neural-networks/deep-rnn.ipynb)\n", "#### 6.10. [双向循环神经网络](chapter_recurrent-neural-networks/bi-rnn.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 7. 优化算法\n", "\n", "如果你一直按照本书的顺序读到这里,那么你已经使用了优化算法来训练深度学习模型。具体来说,在训练模型时,我们会使用优化算法不断迭代模型参数以降低模型损失函数的值。当迭代终止时,模型的训练随之终止,此时的模型参数就是模型通过训练所学习到的参数。\n", "\n", "优化算法对于深度学习十分重要。一方面,训练一个复杂的深度学习模型可能需要数小时、数日,甚至数周时间,而优化算法的表现直接影响模型的训练效率;另一方面,理解各种优化算法的原理以及其中超参数的意义将有助于我们更有针对性地调参,从而使深度学习模型表现更好。\n", "\n", "本章将详细介绍深度学习中常用的优化算法。\n", "\n", "#### 7.1. [优化与深度学习](chapter_optimization/optimization-intro.ipynb)\n", "#### 7.2. [梯度下降和随机梯度下降](chapter_optimization/gd-sgd.ipynb)\n", "#### 7.3. [小批量随机梯度下降](chapter_optimization/minibatch-sgd.ipynb)\n", "#### 7.4. [动量法](chapter_optimization/momentum.ipynb)\n", "#### 7.5. [AdaGrad算法](chapter_optimization/adagrad.ipynb)\n", "#### 7.6. [RMSProp算法](chapter_optimization/rmsprop.ipynb)\n", "#### 7.7. [AdaDelta算法](chapter_optimization/adadelta.ipynb)\n", "#### 7.8. [Adam算法](chapter_optimization/adam.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 10. 自然语言处理¶\n", "自然语言处理关注计算机与人类之间的自然语言交互。在实际中,我们常常使用自然语言处理技术,如“循环神经网络”一章中介绍的语言模型,来处理和分析大量的自然语言数据。本章中,根据输入与输出的不同形式,我们按“定长到定长”、“不定长到定长”、“不定长到不定长”的顺序,逐步展示在自然语言处理中如何表征并变换定长的词或类别以及不定长的句子或段落序列。\n", "\n", "我们先介绍如何用向量表示词,并在语料库上训练词向量。之后,我们把在更大语料库上预训练的词向量应用于求近义词和类比词,即“定长到定长”。接着,在文本分类这种“不定长到定长”的任务中,我们进一步应用词向量来分析文本情感,并分别基于循环神经网络和卷积神经网络为表征时序数据提供两种思路。此外,自然语言处理任务中很多输出是不定长的,如任意长度的句子或段落。我们将描述应对这类问题的编码器—解码器模型、束搜索和注意力机制,并动手实践“不定长到不定长”的机器翻译任务。\n", "#### 10.1. [词嵌入(word2vec)](chapter_natural-language-processing/word2vec.ipynb)\n", "#### 10.2. [近似训练](chapter_natural-language-processing/approx-training.ipynb)\n", "#### 10.3. [word2vec的实现](chapter_natural-language-processing/word2vec-nn.ipynb)\n", "#### 10.4. [子词嵌入(fastText)](chapter_natural-language-processing/fasttext.ipynb)\n", "#### 10.5. [全局向量的词嵌入(GloVe)](chapter_natural-language-processing/glove.ipynb)\n", "#### 10.6. [求近义词和类比词](chapter_natural-language-processing/similarity-analogy.ipynb)\n", "#### 10.7. [文本情感分类:使用循环神经网络](chapter_natural-language-processing/sentiment-analysis-rnn.ipynb)\n", "#### 10.8. [文本情感分类:使用卷积神经网络(textCNN)](chapter_natural-language-processing/sentiment-analysis-cnn.ipynb)\n", "#### 10.9. [编码器—解码器(seq2seq)](chapter_natural-language-processing/seq2seq.ipynb)\n", "#### 10.10. [束搜索](chapter_natural-language-processing/beam-search.ipynb)\n", "#### 10.11. [注意力机制](chapter_natural-language-processing/attention.ipynb)\n", "#### 10.12. [机器翻译](chapter_natural-language-processing/machine-translation.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 11. 附录\n", "#### 11.1. [主要符号一览](chapter_appendix/notation.ipynb)\n", "#### 11.2. [数学基础](chapter_appendix/math.ipynb)\n", "#### 11.3. [使用Jupyter记事本](chapter_appendix/jupyter.ipynb)\n", "#### 11.5. [GPU购买指南](chapter_appendix/buy-gpu.ipynb)\n", "#### 11.7. [d2ltorch包索引](chapter_appendix/d2ltorch.ipynb)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "256px" }, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }