This post is day 12 of the November Gwen Challenge; see the event details: 2021 Last Gwen Challenge.

The multilayer perceptron to be implemented is a two-layer structure: one hidden layer followed by one output layer.
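Concretely, for an input minibatch $X$, the code below computes

$$H = \mathrm{ReLU}(XW_1 + b_1), \qquad O = HW_2 + b_2,$$

with $W_1 \in \mathbb{R}^{784 \times 256}$, $b_1 \in \mathbb{R}^{256}$, $W_2 \in \mathbb{R}^{256 \times 10}$, and $b_2 \in \mathbb{R}^{10}$.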

```python
import torch
from torch import nn
from d2l import torch as d2l
```
```python
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
```

This needs little explanation. If you have read my earlier Hands-on Deep Learning articles, you will recognize it: set the mini-batch size to 256, then load the training set and test set of the Fashion-MNIST dataset.
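If you would rather not depend on the d2l helper, a rough equivalent built directly on torchvision looks like the sketch below (an approximation only: the real d2l helper also supports resizing images and configures the number of workers):

```python
import torchvision
from torch.utils.data import DataLoader
from torchvision import transforms

def load_fashion_mnist(batch_size):
    trans = transforms.ToTensor()  # PIL image -> float tensor in [0, 1]
    train = torchvision.datasets.FashionMNIST(
        root="./data", train=True, transform=trans, download=True)
    test = torchvision.datasets.FashionMNIST(
        root="./data", train=False, transform=trans, download=True)
    return (DataLoader(train, batch_size, shuffle=True),
            DataLoader(test, batch_size, shuffle=False))
```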

A UserWarning may appear here; it was already covered in the article on implementing softmax from scratch, so I will not repeat it.

```python
num_inputs, num_outputs, num_hiddens = 784, 10, 256

W1 = nn.Parameter(torch.randn(
    num_inputs, num_hiddens, requires_grad=True) * 0.01)
b1 = nn.Parameter(torch.zeros(num_hiddens, requires_grad=True))
W2 = nn.Parameter(torch.randn(
    num_hiddens, num_outputs, requires_grad=True) * 0.01)
b2 = nn.Parameter(torch.zeros(num_outputs, requires_grad=True))

params = [W1, b1, W2, b2]
```
  • First, set the sizes of the input layer, hidden layer, and output layer.
    • Each image in the dataset is 28×28, so the input vector has 28×28 = 784 elements.
    • We build a multilayer perceptron with a single hidden layer containing 256 hidden units.
    • The output vector has size 10, because the images fall into ten classes.
  • Then initialize the weights and biases for each layer. Wrapping them in nn.Parameter is optional; I did not use it in earlier articles (a sketch of that plain-tensor alternative follows this list).
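For comparison, the plain-tensor version without nn.Parameter (the style from my earlier from-scratch articles) would look roughly like this:

```python
# Plain tensors with requires_grad=True work just as well here,
# because we hand the params list to the optimizer ourselves.
W1 = torch.normal(0, 0.01, size=(num_inputs, num_hiddens), requires_grad=True)
b1 = torch.zeros(num_hiddens, requires_grad=True)
W2 = torch.normal(0, 0.01, size=(num_hiddens, num_outputs), requires_grad=True)
b2 = torch.zeros(num_outputs, requires_grad=True)
params = [W1, b1, W2, b2]
```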
```python
def relu(X):
    # Elementwise max(X, 0): negative entries become 0
    a = torch.zeros_like(X)
    return torch.max(X, a)
```

The activation function here is ReLU, not sigmoid or the like. For more on this, see: Common Activation Functions – Juejin (juejin.cn).
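A quick sanity check of this hand-written relu (a hypothetical snippet; it should agree with the built-in torch.relu):

```python
X = torch.tensor([[-1.0, 0.0, 2.5]])
print(relu(X))        # expect tensor([[0.0000, 0.0000, 2.5000]])
print(torch.relu(X))  # the built-in version gives the same result
```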

```python
def net(X):
    # Flatten each 28x28 image into a length-784 vector
    X = X.reshape((-1, num_inputs))
    H = relu(X @ W1 + b1)  # "@" stands for matrix multiplication
    return H @ W2 + b2
```

Set up the network.

  • First, deal with X: the reshape flattens each 28×28 image into a length-784 row vector so it can be multiplied with W1 (a quick shape check follows this list).
  • Matrix multiplication uses the @ operator; for details see: Various Multiplications in PyTorch – Juejin (juejin.cn).
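To verify the shapes, you can push a dummy batch through the network (a quick check, not part of the original code):

```python
X = torch.randn(2, 1, 28, 28)  # a fake batch of 2 Fashion-MNIST images
out = net(X)
print(out.shape)               # torch.Size([2, 10]): one logit per class
```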
```python
loss = nn.CrossEntropyLoss()
```

Here we use the built-in cross-entropy loss directly rather than reinventing the wheel. If you are interested in how to implement cross-entropy loss by hand, see Hands-on Deep Learning 3.6: Softmax Regression from Scratch – Juejin (juejin.cn).
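For reference, the hand-rolled version from that article is essentially the following (assuming y_hat holds softmax probabilities and y holds class indices; note that nn.CrossEntropyLoss instead takes raw logits and applies log-softmax internally):

```python
def cross_entropy(y_hat, y):
    # Pick out each example's predicted probability of its true class,
    # then take the negative log
    return -torch.log(y_hat[range(len(y_hat)), y])
```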

```python
num_epochs, lr = 10, 0.1
updater = torch.optim.SGD(params, lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)
d2l.predict_ch3(net, test_iter)
```
  • Set the number of training epochs and the learning rate.
  • Set up the optimizer, here mini-batch SGD over params.
  • d2l.train_ch3 trains the model (a sketch of its core loop follows below).
  • d2l.predict_ch3 evaluates the learned model by applying it to some test data.
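If you are curious what d2l.train_ch3 does each epoch, the core loop is roughly the sketch below (simplified: the real helper also accumulates metrics, evaluates test accuracy, and plots the curves):

```python
def train_epoch(net, train_iter, loss, updater):
    for X, y in train_iter:
        y_hat = net(X)
        l = loss(y_hat, y)   # mean cross-entropy over the mini-batch
        updater.zero_grad()
        l.backward()         # backpropagate
        updater.step()       # SGD parameter update
```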