PyTorch: nn
A third-order polynomial, trained to predict \(y=\sin(x)\) from \(-\pi\) to \(\pi\) by minimizing squared Euclidean distance.
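Concretely, with a cubic model \(\hat{y} = a + b x + c x^2 + d x^3\), training minimizes the summed squared error \(\sum_i \left(\hat{y}(x_i) - \sin(x_i)\right)^2\) over the sampled points \(x_i\) (the coefficient names \(a, b, c, d\) are used here only for illustration; in the code below they correspond to the bias and the three weights of a linear layer).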
This implementation uses the nn package from PyTorch to build the network. PyTorch autograd makes it easy to define computational graphs and take gradients, but raw autograd can be a bit too low-level for defining complex neural networks; this is where the nn package can help. The nn package defines a set of Modules, which you can think of as neural network layers that produce output from input and may hold some trainable weights.
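As a minimal standalone sketch (not part of the example script below), a Module such as torch.nn.Linear can be called like a function and exposes its trainable weights via parameters():

import torch

# A single Linear layer is a Module: it maps 3 input features to 1 output
# feature, and it holds a (1, 3) weight Tensor and a (1,) bias Tensor.
layer = torch.nn.Linear(3, 1)
out = layer(torch.randn(5, 3))   # calling the Module runs its forward pass
print(out.shape)                 # torch.Size([5, 1])
print([p.shape for p in layer.parameters()])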
Out:
99 240.86277770996094
199 167.79660034179688
299 117.86258697509766
399 83.69758605957031
499 60.295040130615234
599 44.246192932128906
699 33.227760314941406
799 25.654376983642578
899 20.443077087402344
999 16.853191375732422
1099 14.37750244140625
1199 12.668405532836914
1299 11.487274169921875
1399 10.670136451721191
1499 10.104262351989746
1599 9.71200942993164
1699 9.439839363098145
1799 9.25082015991211
1899 9.119425773620605
1999 9.028006553649902
Result: y = 0.013691586442291737 + 0.8503276705741882 x + -0.002362025436013937 x^2 + -0.09241817146539688 x^3
import torch
import math
# Create Tensors to hold input and outputs.
x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)
# For this example, the output y is a linear function of (x, x^2, x^3), so
# we can consider it as a linear layer neural network. Let's prepare the
# tensor (x, x^2, x^3).
p = torch.tensor([1, 2, 3])
xx = x.unsqueeze(-1).pow(p)
# In the above code, x.unsqueeze(-1) has shape (2000, 1), and p has shape
# (3,), for this case, broadcasting semantics will apply to obtain a tensor
# of shape (2000, 3)
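# (Added check, for illustration only: each row of xx is (x_i, x_i**2, x_i**3).)
assert xx.shape == (2000, 3)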
# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. The Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
# The Flatten layer flattens the output of the linear layer to a 1D tensor,
# to match the shape of `y`.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 1),
    torch.nn.Flatten(0, 1)
)
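# (Added note: nn.Linear(3, 1) holds a weight Tensor of shape (1, 3) and a
# bias Tensor of shape (1,). Flatten(0, 1) collapses the (2000, 1) output of
# the Linear layer into a 1D Tensor of shape (2000,), matching the shape of y.)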
# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')
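# (Added note: reduction='sum' adds up the squared errors over all 2000
# samples instead of averaging them, so the loss and its gradients scale with
# the number of samples; this is one reason the learning rate below is tiny.)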
learning_rate = 1e-6
for t in range(2000):
    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(xx)

    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the
    # loss.
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())
    # Zero the gradients before running the backward pass.
    model.zero_grad()
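    # (Added note: backward() accumulates gradients into each parameter's
    # .grad field, so the gradients must be reset to zero on every iteration.)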
    # Backward pass: compute gradient of the loss with respect to all the learnable
    # parameters of the model. Internally, the parameters of each Module are stored
    # in Tensors with requires_grad=True, so this call will compute gradients for
    # all learnable parameters in the model.
    loss.backward()
    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before. The update runs inside
    # torch.no_grad() so that it is not tracked by autograd.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad
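
# (Added note, describing typical usage rather than this example: the manual
# update above is usually replaced by an optimizer from torch.optim, e.g.
# torch.optim.SGD(model.parameters(), lr=learning_rate), together with
# optimizer.zero_grad(), loss.backward() and optimizer.step().)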
# You can access the first layer of `model` like accessing the first item of a list
linear_layer = model[0]
# For linear layer, its parameters are stored as `weight` and `bias`.
print(f'Result: y = {linear_layer.bias.item()} + {linear_layer.weight[:, 0].item()} x + {linear_layer.weight[:, 1].item()} x^2 + {linear_layer.weight[:, 2].item()} x^3')
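As an optional usage sketch (an illustrative addition, assuming the script above has already run), the trained model can be evaluated at new inputs in the same way it was called during training:

with torch.no_grad():
    x_test = torch.tensor([0.0, math.pi / 4, math.pi / 2])
    xx_test = x_test.unsqueeze(-1).pow(p)    # rows of (x, x^2, x^3), as above
    print(model(xx_test))     # polynomial approximation of sin at x_test
    print(torch.sin(x_test))  # exact values, for comparison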
Total running time of the script: (0 minutes 0.806 seconds)