{
"cells": [
{
"cell_type": "markdown",
"id": "e59a56ee",
"metadata": {},
"source": [
"**Lab 4: Forward Propagation and Backpropagation in Neural Networks**"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "62f100c9",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"id": "eebaf0fe",
"metadata": {},
"source": [
"**Part 1: Introduction to PyTorch**"
]
},
{
"cell_type": "markdown",
"id": "2a394db6",
"metadata": {},
"source": [
"This section introduces a small subset of the PyTorch modules and functions you will commonly need; for anything beyond it, see the [official PyTorch tutorials](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html) and the [official PyTorch documentation](https://pytorch.org/docs/stable/index.html)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "6fae149b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2.1.0+cpu\n"
]
}
],
"source": [
"import torch  # note: the package is imported as torch, not pytorch\n",
"print(torch.__version__)  # print the installed PyTorch version"
]
},
{
"cell_type": "markdown",
"id": "392780bb",
"metadata": {},
"source": [
"1. Tensors"
]
},
{
"cell_type": "markdown",
"id": "904623e4",
"metadata": {},
"source": [
"Tensors are very similar to NumPy's ndarrays, but a Tensor can additionally use a GPU to accelerate computation (although this course does not use one)."
]
},
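{
"cell_type": "markdown",
"id": "c0ffee01",
"metadata": {},
"source": [
"As a quick illustration of the GPU remark above (a minimal sketch, not needed for this course): `torch.cuda.is_available()` reports whether a CUDA device is usable, and a Tensor can be created on, or moved between, devices with the `device` argument and `.to()`.\n",
"\n",
"```python\n",
"import torch\n",
"\n",
"# fall back to the CPU when no CUDA device is available\n",
"device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
"x = torch.ones(2, 3, device=device)  # create the Tensor directly on the chosen device\n",
"y = (x * 2).to(\"cpu\")                # move the result (back) to the CPU\n",
"print(y.sum())  # tensor(12.)\n",
"```"
]
},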
{
"cell_type": "markdown",
"id": "23642996",
"metadata": {},
"source": [
"1.1. Creating Tensors"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "38cdf3ad",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[-2.9818e-08, 1.8357e-42, 0.0000e+00],\n",
" [ 0.0000e+00, 0.0000e+00, 0.0000e+00]])\n",
"tensor([[1, 2, 3],\n",
" [4, 5, 6]])\n",
"tensor([[0.6417, 0.6556, 0.7544, 0.3146],\n",
" [0.3440, 0.0655, 0.0828, 0.5430],\n",
" [0.5535, 0.4441, 0.1285, 0.8876]])\n",
"tensor([[0., 0., 0.],\n",
" [0., 0., 0.]])\n",
"tensor([[1., 1., 1.],\n",
" [1., 1., 1.]])\n"
]
}
],
"source": [
"# create an uninitialized Tensor\n",
"x = torch.empty(2, 3)\n",
"print(x)\n",
"\n",
"# create a Tensor from a list\n",
"x = torch.tensor([[1, 2, 3], [4, 5, 6]])\n",
"print(x)\n",
"\n",
"# create a random Tensor\n",
"x = torch.rand([3, 4])\n",
"print(x)\n",
"\n",
"# create an all-zeros Tensor\n",
"x = torch.zeros([2, 3])\n",
"print(x)\n",
"\n",
"# create an all-ones Tensor\n",
"x = torch.ones([2, 3])\n",
"print(x)"
]
},
{
"cell_type": "markdown",
"id": "9f354fd4",
"metadata": {},
"source": [
"1.2. Tensor Operations"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "8070a903",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[7, 7, 7],\n",
" [7, 7, 7]])\n",
"tensor([[-5, -3, -1],\n",
" [ 1, 3, 5]])\n",
"tensor([[ 6, 10, 12],\n",
" [12, 10, 6]])\n",
"tensor([[ 6, 10, 12],\n",
" [12, 10, 6]])\n",
"tensor([[28, 10],\n",
" [73, 28]])\n",
"tensor([[28, 10],\n",
" [73, 28]])\n",
"tensor([[1, 2],\n",
" [3, 4],\n",
" [5, 6]])\n",
"tensor([[1, 2, 3],\n",
" [4, 5, 6],\n",
" [6, 5, 4],\n",
" [3, 2, 1]])\n",
"tensor([[1, 2, 3, 6, 5, 4],\n",
" [4, 5, 6, 3, 2, 1]])\n"
]
}
],
"source": [
"# addition and subtraction\n",
"x = torch.tensor([[1, 2, 3], [4, 5, 6]])\n",
"y = torch.tensor([[6, 5, 4], [3, 2, 1]])\n",
"print(x + y)\n",
"print(x - y)\n",
"\n",
"# element-wise multiplication\n",
"print(x * y)\n",
"print(x.mul(y))\n",
"\n",
"# matrix multiplication\n",
"print(x.matmul(y.T))\n",
"print(x @ y.T)\n",
"\n",
"# reshape\n",
"print(x.reshape(3, 2))\n",
"\n",
"# concatenation\n",
"print(torch.cat([x, y], dim=0))  # concatenate vertically (along rows)\n",
"print(torch.cat([x, y], dim=1))  # concatenate horizontally (along columns)"
]
},
{
"cell_type": "markdown",
"id": "95961cee",
"metadata": {},
"source": [
"1.3. Converting Between Tensors and ndarrays"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "d1269212",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[1, 2, 3],\n",
" [4, 5, 6]])\n",
"[[1 2 3]\n",
" [4 5 6]]\n",
"[[0 0 0]\n",
" [0 0 0]]\n",
"tensor([[0, 0, 0],\n",
" [0, 0, 0]])\n"
]
}
],
"source": [
"x = torch.tensor([[1, 2, 3], [4, 5, 6]])\n",
"print(x)\n",
"\n",
"# convert a Tensor to an ndarray\n",
"y = x.numpy()\n",
"print(y)\n",
"\n",
"# the Tensor and the ndarray share the same memory\n",
"x[:] = 0\n",
"print(y)\n",
"\n",
"# convert an ndarray to a Tensor (this also shares memory)\n",
"z = torch.from_numpy(y)\n",
"print(z)"
]
},
{
"cell_type": "markdown",
"id": "bb6c36a7",
"metadata": {},
"source": [
"2. Automatic Gradient Computation"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "3b30fec3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[3., 4.]]) tensor(1.)\n",
"tensor([0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966,\n",
" 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966, 0.1966,\n",
" 0.1966, 0.1966])\n",
"tensor(4.)\n",
"tensor(5.)\n",
"tensor(0.)\n"
]
}
],
"source": [
"a = torch.tensor([[1., 2.]], requires_grad=True)  # requires_grad=True starts tracking all operations on this Tensor\n",
"b = torch.tensor([[3.], [4.]])\n",
"c = torch.tensor(5., requires_grad=True)\n",
"y = a @ b + c\n",
"y.backward()  # compute the gradients automatically\n",
"print(a.grad, c.grad)  # print the gradients of the leaf nodes a and c\n",
"\n",
"# gradients flow through many kinds of operations, e.g. torch.mean(), torch.sum(), etc.\n",
"a = torch.ones(20, requires_grad=True)\n",
"z = torch.sum(torch.sigmoid(a))\n",
"z.backward()\n",
"print(a.grad)\n",
"\n",
"# gradients accumulate across backward passes; use tensor.grad.zero_() to reset them manually\n",
"x = torch.tensor(2., requires_grad=True)\n",
"y = x ** 2\n",
"y.backward()\n",
"print(x.grad)\n",
"z = x + 3\n",
"z.backward()\n",
"print(x.grad)\n",
"x.grad.zero_()\n",
"print(x.grad)"
]
},
{
"cell_type": "markdown",
"id": "96d22808",
"metadata": {},
"source": [
"3. Neural Networks (an example from the official tutorial)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "207b6433",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Net(\n",
" (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))\n",
" (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))\n",
" (fc1): Linear(in_features=400, out_features=120, bias=True)\n",
" (fc2): Linear(in_features=120, out_features=84, bias=True)\n",
" (fc3): Linear(in_features=84, out_features=10, bias=True)\n",
")\n"
]
}
],
"source": [
"import torch\n",
"import torch.nn as nn\n",
"import torch.nn.functional as F\n",
"\n",
"# subclass nn.Module\n",
"class Net(nn.Module):\n",
"\n",
" def __init__(self):\n",
" super(Net, self).__init__()\n",
"        # convolutional layers\n",
" # 1 input image channel, 6 output channels, 5x5 square convolution\n",
" # kernel\n",
" self.conv1 = nn.Conv2d(1, 6, 5)\n",
" self.conv2 = nn.Conv2d(6, 16, 5)\n",
" # an affine operation: y = Wx + b\n",
" # nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)\n",
"        # in_features is the number of inputs; out_features is the number of neurons in this layer\n",
" self.fc1 = nn.Linear(16 * 5 * 5, 120) # 5*5 from image dimension\n",
" self.fc2 = nn.Linear(120, 84)\n",
" self.fc3 = nn.Linear(84, 10)\n",
"\n",
" def forward(self, x):\n",
" # Max pooling over a (2, 2) window\n",
" x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))\n",
" # If the size is a square, you can specify with a single number\n",
" x = F.max_pool2d(F.relu(self.conv2(x)), 2)\n",
" x = torch.flatten(x, 1) # flatten all dimensions except the batch dimension\n",
" x = F.relu(self.fc1(x))\n",
" x = F.relu(self.fc2(x))\n",
" x = self.fc3(x)\n",
" return x\n",
"\n",
"\n",
"net = Net()\n",
"print(net)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "6c2a943f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10\n",
"torch.Size([6, 1, 5, 5])\n"
]
}
],
"source": [
"# the learnable parameters of the network are accessible via net.parameters()\n",
"params = list(net.parameters())\n",
"print(len(params))\n",
"print(params[0].size()) # conv1's .weight"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "b733263e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[-0.0647, 0.0301, 0.0526, -0.1407, 0.0690, -0.0806, -0.0420, 0.1001,\n",
0.0486, 0.1200]], grad_fn=<AddmmBackward0>)\n"
]
}
],
"source": [
"# randomly generate an input and feed it into net; apart from dimension 0 (the number of samples), the dimensions must match those of x in forward()\n",
"input = torch.randn(1, 1, 32, 32)  # 1 sample: a 32×32 image with 1 channel\n",
"out = net(input)  # one forward() pass\n",
"print(out)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "168659b3",
"metadata": {},
"outputs": [],
"source": [
"net.zero_grad()  # manually zero the gradients of the network's parameters\n",
"out.backward(torch.randn(1, 10))  # one backward() pass with a random gradient to compute gradients"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "af35f14f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor(0.5634, grad_fn=<MseLossBackward0>)\n"
]
}
],
"source": [
"output = net(input)\n",
"target = torch.randn(10) # a dummy target, for example\n",
"target = target.view(1, -1) # make it the same shape as output\n",
"# the nn module provides many kinds of loss functions, e.g. nn.CrossEntropyLoss(), nn.MSELoss(), and so on\n",
"criterion = nn.MSELoss()\n",
"\n",
"loss = criterion(output, target)\n",
"print(loss)"
]
},
{
"cell_type": "markdown",
"id": "1428f7ce",
"metadata": {},
"source": [
"The computation graph is:\n",
"input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d\n",
" -> flatten -> linear -> relu -> linear -> relu -> linear\n",
" -> MSELoss\n",
" -> loss"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "e9ada5be",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\n"
]
}
],
"source": [
"# inspect the functions in the computation graph\n",
"print(loss.grad_fn) # MSELoss\n",
"print(loss.grad_fn.next_functions[0][0]) # Linear\n",
"print(loss.grad_fn.next_functions[0][0].next_functions[0][0]) # ReLU"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "854ee4d1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"conv1.bias.grad before backward\n",
"None\n",
"conv1.bias.grad after backward\n",
"tensor([ 0.0008, -0.0093, -0.0016, 0.0013, 0.0035, -0.0099])\n"
]
}
],
"source": [
"net.zero_grad() # zeroes the gradient buffers of all parameters\n",
"\n",
"print('conv1.bias.grad before backward')\n",
"print(net.conv1.bias.grad)\n",
"\n",
"# perform one backward pass\n",
"loss.backward()\n",
"\n",
"print('conv1.bias.grad after backward')\n",
"print(net.conv1.bias.grad)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "d7187507",
"metadata": {},
"outputs": [],
"source": [
"# update the parameters in net manually with gradient descent\n",
"learning_rate = 0.01\n",
"for f in net.parameters():\n",
" f.data.sub_(f.grad.data * learning_rate)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "06d237aa",
"metadata": {},
"outputs": [],
"source": [
"# use a PyTorch optimizer to update the parameters in net\n",
"import torch.optim as optim\n",
"\n",
"# create your optimizer\n",
"optimizer = optim.SGD(net.parameters(), lr=0.01)\n",
"\n",
"# what to do in each training iteration:\n",
"optimizer.zero_grad()  # zero the gradients\n",
"output = net(input)  # one forward pass\n",
"loss = criterion(output, target)  # compute the loss\n",
"loss.backward()  # backward pass\n",
"optimizer.step()  # update the parameters"
]
},
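{
"cell_type": "markdown",
"id": "c0ffee02",
"metadata": {},
"source": [
"Putting the steps above into a full loop, a training run might look like the following sketch; the model, data, and hyperparameters here are made up purely for illustration:\n",
"\n",
"```python\n",
"import torch\n",
"import torch.nn as nn\n",
"import torch.optim as optim\n",
"\n",
"torch.manual_seed(0)\n",
"\n",
"# illustrative random data: 8 samples with 4 features and 1 target each\n",
"X = torch.randn(8, 4)\n",
"t = torch.randn(8, 1)\n",
"\n",
"model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))\n",
"criterion = nn.MSELoss()\n",
"optimizer = optim.SGD(model.parameters(), lr=0.01)\n",
"\n",
"losses = []\n",
"for epoch in range(50):\n",
"    optimizer.zero_grad()        # zero the gradients\n",
"    output = model(X)            # forward pass\n",
"    loss = criterion(output, t)  # compute the loss\n",
"    loss.backward()              # backward pass\n",
"    optimizer.step()             # update the parameters\n",
"    losses.append(loss.item())\n",
"\n",
"print(losses[0], losses[-1])  # the recorded loss should tend to decrease\n",
"```"
]
},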
{
"cell_type": "markdown",
"id": "50be2d1f",
"metadata": {},
"source": [
"**Part 2: Lab Tasks**"
]
},
{
"cell_type": "markdown",
"id": "ba5214e5",
"metadata": {},
"source": [
"[Red Wine Quality](https://www.kaggle.com/datasets/uciml/red-wine-quality-cortez-et-al-2009) is a dataset on red wine quality with 1599 samples in total; each sample has 11 features (all continuous) and 1 label, and the labels take continuous values. For this lab the data has already been split 8:2 into a training set 'wine_train.csv' and a test set 'wine_test.csv', and both sets have already been normalized."
]
},
{
"cell_type": "markdown",
"id": "d4d9ccac",
"metadata": {},
"source": [
"1) Load the training set 'wine_train.csv' and the test set 'wine_test.csv'."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c759660f",
"metadata": {},
"outputs": [],
"source": [
"# -- Your code here --\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "d77619fd",
"metadata": {},
"source": [
"2) Build a neural network out of linear layers and activation functions. Its input and output dimensions must match those of the dataset; hyperparameters such as network depth, hidden-layer sizes, and the choice of activation function are up to you."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "26359bae",
"metadata": {},
"outputs": [],
"source": [
"# -- Your code here --\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "9b706eea",
"metadata": {},
"source": [
"3) Update the model parameters with gradient descent, recording the training loss and test loss at every iteration."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cad2f094",
"metadata": {},
"outputs": [],
"source": [
"# -- Your code here --\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "ff5ff45e",
"metadata": {},
"source": [
"4) Plot the training loss and test loss against the number of iterations."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3ba3071f",
"metadata": {},
"outputs": [],
"source": [
"# -- Your code here --\n",
"\n",
"\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}