{ "cells": [ { "cell_type": "markdown", "id": "e59a56ee", "metadata": {}, "source": [ "**Lab 4: Forward and Backward Propagation in Neural Networks**" ] }, { "cell_type": "code", "execution_count": 2, "id": "62f100c9", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "eebaf0fe", "metadata": {}, "source": [ "**Part 1: Introduction to PyTorch**" ] }, { "cell_type": "markdown", "id": "2a394db6", "metadata": {}, "source": [ "This section introduces a small selection of commonly used PyTorch modules and functions; for anything beyond it, see the [official PyTorch tutorials](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html) and the [official PyTorch documentation](https://pytorch.org/docs/stable/index.html)." ] }, { "cell_type": "code", "execution_count": 3, "id": "6fae149b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.1.0+cpu\n" ] } ], "source": [ "import torch # the package is imported as torch, not pytorch\n", "print(torch.__version__) # print the current PyTorch version" ] }, { "cell_type": "markdown", "id": "392780bb", "metadata": {}, "source": [ "1. Tensor" ] }, { "cell_type": "markdown", "id": "904623e4", "metadata": {}, "source": [ "A Tensor is very similar to NumPy's ndarray, but a Tensor can also use a GPU to accelerate computation (although this course does not need one)." ] }, { "cell_type": "markdown", "id": "23642996", "metadata": {}, "source": [ "1.1. 
Creating Tensors" ] }, { "cell_type": "code", "execution_count": 4, "id": "38cdf3ad", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[-2.9818e-08, 1.8357e-42, 0.0000e+00],\n", " [ 0.0000e+00, 0.0000e+00, 0.0000e+00]])\n", "tensor([[1, 2, 3],\n", " [4, 5, 6]])\n", "tensor([[0.6417, 0.6556, 0.7544, 0.3146],\n", " [0.3440, 0.0655, 0.0828, 0.5430],\n", " [0.5535, 0.4441, 0.1285, 0.8876]])\n", "tensor([[0., 0., 0.],\n", " [0., 0., 0.]])\n", "tensor([[1., 1., 1.],\n", " [1., 1., 1.]])\n" ] } ], "source": [ "# create an uninitialized Tensor\n", "x = torch.empty(2, 3)\n", "print(x)\n", "\n", "# create a Tensor from a nested list\n", "x = torch.tensor([[1,2,3],[4,5,6]])\n", "print(x)\n", "\n", "# create a Tensor with uniformly random entries\n", "x = torch.rand([3, 4])\n", "print(x)\n", "\n", "# create an all-zeros Tensor\n", "x = torch.zeros([2, 3])\n", "print(x)\n", "\n", "# create an all-ones Tensor\n", "x = torch.ones([2, 3])\n", "print(x)" ] }, { "cell_type": "markdown", "id": "9f354fd4", "metadata": {}, "source": [ "1.2. Tensor operations" ] }, { "cell_type": "code", "execution_count": 5, "id": "8070a903", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[7, 7, 7],\n", " [7, 7, 7]])\n", "tensor([[-5, -3, -1],\n", " [ 1, 3, 5]])\n", "tensor([[ 6, 10, 12],\n", " [12, 10, 6]])\n", "tensor([[ 6, 10, 12],\n", " [12, 10, 6]])\n", "tensor([[28, 10],\n", " [73, 28]])\n", "tensor([[28, 10],\n", " [73, 28]])\n", "tensor([[1, 2],\n", " [3, 4],\n", " [5, 6]])\n", "tensor([[1, 2, 3],\n", " [4, 5, 6],\n", " [6, 5, 4],\n", " [3, 2, 1]])\n", "tensor([[1, 2, 3, 6, 5, 4],\n", " [4, 5, 6, 3, 2, 1]])\n" ] } ], "source": [ "# addition and subtraction\n", "x = torch.tensor([[1,2,3],[4,5,6]])\n", "y = torch.tensor([[6,5,4],[3,2,1]])\n", "print(x + y)\n", "print(x - y)\n", "\n", "# element-wise multiplication\n", "print(x * y)\n", "print(x.mul(y))\n", "\n", "# matrix multiplication\n", "print(x.matmul(y.T))\n", "print(x @ y.T)\n", "\n", "# reshape\n", "print(x.reshape(3, 2))\n", "\n", "# concatenation\n", "print(torch.cat([x,y], dim=0)) # concatenate along dim 0 (vertically)\n", "print(torch.cat([x,y], dim=1)) # 
concatenate along dim 1 (horizontally)" ] }, { "cell_type": "markdown", "id": "95961cee", "metadata": {}, "source": [ "1.3. Converting between Tensor and ndarray" ] }, { "cell_type": "code", "execution_count": 6, "id": "d1269212", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[1, 2, 3],\n", " [4, 5, 6]])\n", "[[1 2 3]\n", " [4 5 6]]\n", "[[0 0 0]\n", " [0 0 0]]\n", "tensor([[0, 0, 0],\n", " [0, 0, 0]])\n" ] } ], "source": [ "x = torch.tensor([[1,2,3],[4,5,6]])\n", "print(x)\n", "\n", "# convert a Tensor to an ndarray\n", "y = x.numpy()\n", "print(y)\n", "\n", "# the Tensor and the ndarray share the same underlying memory\n", "x[:] = 0\n", "print(y)\n", "\n", "# convert an ndarray back to a Tensor\n", "z = torch.from_numpy(y)\n", "print(z)" ] }, { "cell_type": "markdown", "id": "bb6c36a7", "metadata": {}, "source": [ "2. Automatic differentiation" ] }, { "cell_type": "code", "execution_count": 7, "id": "3b30fec3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[3., 4.]]) tensor(1.)\n", "tensor([0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966,\n", " 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966,\n", " 0.1966, 0.1966])\n", "tensor(4.)\n", "tensor(5.)\n", "tensor(0.)\n" ] } ], "source": [ "a = torch.tensor([[1.,2.]], requires_grad=True) # setting requires_grad=True starts tracking all operations based on this tensor\n", "b = torch.tensor([[3.],[4.]])\n", "c = torch.tensor(5., requires_grad=True)\n", "y = a @ b + c\n", "y.backward() # compute gradients automatically\n", "print(a.grad, c.grad) # print the gradients of the leaf nodes a and c\n", "\n", "# gradients flow through many kinds of operations, e.g. torch.mean(), torch.sum(), etc.\n", "a = torch.ones(20, requires_grad=True)\n", "z = torch.sum(torch.sigmoid(a))\n", "z.backward()\n", "print(a.grad)\n", "\n", "# gradients accumulate over repeated backward calls; clear them manually with tensor.grad.zero_()\n", "x = torch.tensor(2., requires_grad=True)\n", "y = x ** 2\n", "y.backward()\n", "print(x.grad)\n", "z = x + 3\n", "z.backward()\n", "print(x.grad)\n", "x.grad.zero_()\n", "print(x.grad)" ] }, { "cell_type": "markdown", "id": "96d22808", "metadata": {}, "source": [ "3. 
Neural networks (the example from the official tutorial)" ] }, { "cell_type": "code", "execution_count": 8, "id": "207b6433", "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Net(\n", " (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))\n", " (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))\n", " (fc1): Linear(in_features=400, out_features=120, bias=True)\n", " (fc2): Linear(in_features=120, out_features=84, bias=True)\n", " (fc3): Linear(in_features=84, out_features=10, bias=True)\n", ")\n" ] } ], "source": [ "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "\n", "# subclass nn.Module\n", "class Net(nn.Module):\n", "\n", "    def __init__(self):\n", "        super(Net, self).__init__()\n", "        # convolutional layers\n", "        # 1 input image channel, 6 output channels, 5x5 square convolution\n", "        # kernel\n", "        self.conv1 = nn.Conv2d(1, 6, 5)\n", "        self.conv2 = nn.Conv2d(6, 16, 5)\n", "        # an affine operation: y = Wx + b\n", "        # nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)\n", "        # in_features is the number of inputs; out_features is the number of neurons in this layer\n", "        self.fc1 = nn.Linear(16 * 5 * 5, 120) # 5*5 from image dimension\n", "        self.fc2 = nn.Linear(120, 84)\n", "        self.fc3 = nn.Linear(84, 10)\n", "\n", "    def forward(self, x):\n", "        # Max pooling over a (2, 2) window\n", "        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))\n", "        # If the size is a square, you can specify with a single number\n", "        x = F.max_pool2d(F.relu(self.conv2(x)), 2)\n", "        x = torch.flatten(x, 1) # flatten all dimensions except the batch dimension\n", "        x = F.relu(self.fc1(x))\n", "        x = F.relu(self.fc2(x))\n", "        x = self.fc3(x)\n", "        return x\n", "\n", "\n", "net = Net()\n", "print(net)" ] }, { "cell_type": "code", "execution_count": 9, "id": "6c2a943f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "10\n", "torch.Size([6, 1, 5, 5])\n" ] } ], "source": [ "# the learnable parameters of the network can be accessed via net.parameters()\n", "params = list(net.parameters())\n", 
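"# each of the 5 layers (conv1, conv2, fc1, fc2, fc3) contributes a weight and a bias,\n", "# so net.parameters() should yield 10 parameter tensors in total\n", "assert len(params) == 10\n", 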
"print(len(params))\n", "print(params[0].size()) # conv1's .weight" ] }, { "cell_type": "code", "execution_count": 10, "id": "b733263e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[-0.0647, 0.0301, 0.0526, -0.1407, 0.0690, -0.0806, -0.0420, 0.1001,\n", " 0.0486, 0.1200]], grad_fn=<AddmmBackward0>)\n" ] } ], "source": [ "# feed a random input into net; apart from dim 0 (the number of samples), the dimensions must match those of the argument x of forward()\n", "input = torch.randn(1, 1, 32, 32) # 1 sample: a 1-channel 32x32 image\n", "out = net(input) # one forward() pass\n", "print(out)" ] }, { "cell_type": "code", "execution_count": 11, "id": "168659b3", "metadata": {}, "outputs": [], "source": [ "net.zero_grad() # manually zero the gradients of the network parameters\n", "out.backward(torch.randn(1, 10)) # one backward() pass with a random upstream gradient" ] }, { "cell_type": "code", "execution_count": 12, "id": "af35f14f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor(0.5634, grad_fn=<MseLossBackward0>)\n" ] } ], "source": [ "output = net(input)\n", "target = torch.randn(10) # a dummy target, for example\n", "target = target.view(1, -1) # make it the same shape as output\n", "# the nn module provides many kinds of loss functions, e.g. nn.CrossEntropyLoss(), nn.MSELoss(), etc.\n", "criterion = nn.MSELoss()\n", "\n", "loss = criterion(output, target)\n", "print(loss)" ] }, { "cell_type": "markdown", "id": "1428f7ce", "metadata": {}, "source": [ "The computation graph:\n", "```\n", "input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d\n", "      -> flatten -> linear -> relu -> linear -> relu -> linear\n", "      -> MSELoss\n", "      -> loss\n", "```" ] }, { "cell_type": "code", "execution_count": 13, "id": "e9ada5be", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<MseLossBackward0 object at 0x...>\n", "<AddmmBackward0 object at 0x...>\n", "<AccumulateGrad object at 0x...>\n" ] } ], "source": [ "# inspect the functions in the computation graph\n", "print(loss.grad_fn) # MSELoss\n", "print(loss.grad_fn.next_functions[0][0]) # Linear\n", "print(loss.grad_fn.next_functions[0][0].next_functions[0][0]) # ReLU\n", "# note: depending on the PyTorch version, this last slot may be the AccumulateGrad\n", "# node of the last linear layer's bias rather than the ReLU node" ] }, { "cell_type": "code", "execution_count": 14, "id": "854ee4d1", "metadata": {}, "outputs": [ { "name": "stdout", 
"output_type": "stream", "text": [ "conv1.bias.grad before backward\n", "None\n", "conv1.bias.grad after backward\n", "tensor([ 0.0008, -0.0093, -0.0016, 0.0013, 0.0035, -0.0099])\n" ] } ], "source": [ "net.zero_grad() # zeroes the gradient buffers of all parameters\n", "\n", "print('conv1.bias.grad before backward')\n", "print(net.conv1.bias.grad)\n", "\n", "# one backward pass\n", "loss.backward()\n", "\n", "print('conv1.bias.grad after backward')\n", "print(net.conv1.bias.grad)" ] }, { "cell_type": "code", "execution_count": 15, "id": "d7187507", "metadata": {}, "outputs": [], "source": [ "# update the parameters of net manually with gradient descent\n", "learning_rate = 0.01\n", "for f in net.parameters():\n", "    f.data.sub_(f.grad.data * learning_rate)" ] }, { "cell_type": "code", "execution_count": 16, "id": "06d237aa", "metadata": {}, "outputs": [], "source": [ "# update the parameters of net with a PyTorch optimizer\n", "import torch.optim as optim\n", "\n", "# create your optimizer\n", "optimizer = optim.SGD(net.parameters(), lr=0.01)\n", "\n", "# what each training iteration should do:\n", "optimizer.zero_grad() # zero the gradients\n", "output = net(input) # forward pass\n", "loss = criterion(output, target) # compute the loss\n", "loss.backward() # backward pass\n", "optimizer.step() # update the parameters" ] }, { "cell_type": "markdown", "id": "50be2d1f", "metadata": {}, "source": [ "**Part 2: Lab Tasks**" ] }, { "cell_type": "markdown", "id": "ba5214e5", "metadata": {}, "source": [ "[Red Wine Quality](https://www.kaggle.com/datasets/uciml/red-wine-quality-cortez-et-al-2009) is a dataset on red-wine quality with 1599 samples in total; each sample has 11 (all continuous) features and 1 label, and the labels take continuous values. For this lab it has already been split 8:2 into a training set 'wine_train.csv' and a test set 'wine_test.csv', and both sets have been normalized." ] }, { "cell_type": "markdown", "id": "d4d9ccac", "metadata": {}, "source": [ "1) Load the training set 'wine_train.csv' and the test set 'wine_test.csv'." ] }, { "cell_type": "code", "execution_count": null, "id": "c759660f", "metadata": {}, "outputs": [], "source": [ "# -- Your code here --\n", "\n", "\n" ] }, { "cell_type": "markdown", "id": "d77619fd", "metadata": {}, "source": [ "2) 
Using linear layers and activation functions, build a neural network whose input and output dimensions match those of the dataset; hyperparameters such as the network depth, the hidden-layer sizes, and the choice of activation function are up to you." ] }, { "cell_type": "code", "execution_count": null, "id": "26359bae", "metadata": {}, "outputs": [], "source": [ "# -- Your code here --\n", "\n", "\n" ] }, { "cell_type": "markdown", "id": "9b706eea", "metadata": {}, "source": [ "3) Update the model parameters with gradient descent, recording the training loss and the test loss in every iteration." ] }, { "cell_type": "code", "execution_count": null, "id": "cad2f094", "metadata": {}, "outputs": [], "source": [ "# -- Your code here --\n", "\n", "\n" ] }, { "cell_type": "markdown", "id": "ff5ff45e", "metadata": {}, "source": [ "4) Plot the training loss and the test loss against the number of iterations." ] }, { "cell_type": "code", "execution_count": null, "id": "3ba3071f", "metadata": {}, "outputs": [], "source": [ "# -- Your code here --\n", "\n", "\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.12" } }, "nbformat": 4, "nbformat_minor": 5 }