The original link: mp.weixin.qq.com/s/WZdBm2JQ4…

This is a translation from the official introductory tutorial at the following address:

DEEP LEARNING WITH PYTORCH: A 60 MINUTE BLITZ

Although the tutorial calls itself a 60-minute introduction, it covers a lot of ground: the full translation runs to nearly 10,000 words, so it is split into roughly 4 articles. This is the first one, covering what PyTorch is, tensors, and the autograd package.


1. What is PyTorch

PyTorch is a Python-based scientific computing library aimed at two groups of users:

  • a replacement for NumPy that takes advantage of the power of GPUs;
  • a deep learning research platform that provides flexibility and speed.

1.1 Installation

PyTorch installation instructions can be found at pytorch.org/get-started…

Select the operating system (Linux, Mac, or Windows), the installation method (Conda, Pip, LibTorch, or from source), and the language (Python 2.7, Python 3.5/3.6/3.7, or C++); for the GPU version you also need to select the CUDA version. With the selection shown above, the installation command is:

conda install pytorch torchvision cudatoolkit=9.0 -c pytorch

The Conda installation, i.e. using Anaconda, is recommended here, mainly because it lets you keep different configurations in separate environments. For Anaconda, see the introduction to Python basics and environment configuration I wrote earlier.

Of course, this installs the latest version of PyTorch, which is 1.1. If you want to install a previous version, you can visit the following URL:

Pytorch.org/get-started…

It shows, for example, how to install PyTorch 0.4.1 with different CUDA versions or without CUDA.

Other installation methods are also listed there and can be clicked through.

After installation, run the following code:

from __future__ import print_function
import torch
x = torch.rand(5, 3)
print(x)

The installation is successful if the following information is displayed:

tensor([[0.3380, 0.3845, 0.3217],
        [0.8337, 0.9050, 0.2650],
        [0.2979, 0.7141, 0.9069],
        [0.1449, 0.1132, 0.1375],
        [0.4675, 0.3947, 0.1426]])

torch.cuda.is_available() checks whether the current GPU can be used: if it returns True, the GPU can be used; otherwise it cannot.

import torch
torch.cuda.is_available()

1.2 Tensors

Since PyTorch can serve as a replacement for NumPy, we first introduce Tensors, the counterpart of NumPy's multidimensional arrays (ndarrays). The difference between the two is that Tensors can run on GPUs to accelerate computation.

First import the necessary library, mainly torch:

from __future__ import print_function
import torch
1.2.1 Declaration and Definition

First, here are several ways to declare and define Tensors:

  • torch.empty(): declares an uninitialized matrix.
# Create a 5x3 matrix
x = torch.empty(5, 3)
print(x)

The following output is displayed:

tensor([[9.2737e-41, 8.9074e-01, 1.9286e-37],
        [1.7228e-34, 5.7064e+01, 9.2737e-41],
        [2.2803e+02, 1.9288e-37, 1.7228e-34],
        [1.4609e+04, 9.2737e-41, 5.8375e+04],
        [1.9290e-37, 1.7228e-34, 3.7402e+06]])
  • torch.rand(): randomly initializes a matrix.
# Create a randomly initialized 5x3 matrix
rand_x = torch.rand(5, 3)
print(rand_x)

Output result:

tensor([[0.4311, 0.2798, 0.8444],
        [0.0829, 0.9029, 0.8463],
        [0.7139, 0.4225, 0.5623],
        [0.7642, 0.0329, 0.8816],
        [1.0000, 0.9830, 0.9256]])
  • torch.zeros(): creates a matrix filled with zeros.
# Create a long-typed 5x3 matrix filled with zeros
zero_x = torch.zeros(5, 3, dtype=torch.long)
print(zero_x)

The following output is displayed:

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])

Similarly, you can create a matrix filled with ones by calling torch.ones(), as in the sketch below.
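For example, a minimal sketch:

# Create a 5x3 matrix filled with ones
ones_x = torch.ones(5, 3)
print(ones_x)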

  • torch.tensor(): creates a tensor directly from the given values.
# The tensor value is [5.5, 3]
tensor1 = torch.tensor([5.5, 3])
print(tensor1)

Output result:

tensor([5.5000, 3.0000])

You can also create a new tensor from an existing tensor variable. The benefit is that the new tensor reuses properties of the existing one, such as its size and dtype, unless you explicitly redefine them. The corresponding methods are as follows:

  • tensor.new_ones(): the new_*() methods take the tensor size as input.
# New size is 5x3, dtype is torch.double
tensor2 = tensor1.new_ones(5, 3, dtype=torch.double)  # new_* methods take the tensor size
print(tensor2)

Output result:

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
  • torch.randn_like(old_tensor): keeps the same size as old_tensor.
# Change the dtype
tensor3 = torch.randn_like(tensor2, dtype=torch.float)
print('tensor3: ', tensor3)

The output shows that tensor3 keeps the size of tensor2 but now has the new dtype (torch.float rather than torch.double):

tensor3:  tensor([[0.4491, 0.2634, 0.0040],
        [0.1624, 0.4475, 0.8407],
        [0.6539, 1.2772, 0.6060],
        [0.2304, 0.0879, 0.3876],
        [1.2900, 0.7475, 1.8212]])

Finally, tensor.size() can be used to obtain the dimensions of a tensor:

print(tensor3.size())
# Output: torch.Size([5, 3])

Note that torch.Size is in fact a tuple, so it supports all tuple operations, as in the sketch below.
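For instance, a minimal sketch of treating it like an ordinary tuple:

# torch.Size supports tuple operations such as unpacking
rows, cols = tensor3.size()
print(rows, cols)  # 5 3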

1.2.2 Operations

There is a lot of syntax for operations, but as a quick start we will use only addition as an example. For more operations, including transposing, indexing, slicing, math, linear algebra, and random numbers, see the official documentation:

Pytorch.org/docs/stable…

For addition operations, there are several implementations:

  • +
  • torch.add(tensor1, tensor2, [out=tensor3])
  • tensor1.add_(tensor2): modifies the tensor variable in place
tensor4 = torch.rand(5, 3)
print('tensor3 + tensor4= ', tensor3 + tensor4)
print('tensor3 + tensor4= ', torch.add(tensor3, tensor4))
# Store the result of the addition in a new tensor variable
result = torch.empty(5, 3)
torch.add(tensor3, tensor4, out=result)
print('add result= ', result)
# Modify the variable in place
tensor3.add_(tensor4)
print('tensor3= ', tensor3)

The output:

tensor3 + tensor4=  tensor([[ 0.1000, 0.1325, 0.0461],
        [ 0.4731, 0.4523, 0.7517],
        [ 0.2995, 0.9576, 1.4906],
        [ 1.0461, 0.7557, 0.0187],
        [ 2.2446, 0.3473, 1.0873]])

tensor3 + tensor4=  tensor([[ 0.1000, 0.1325, 0.0461],
        [ 0.4731, 0.4523, 0.7517],
        [ 0.2995, 0.9576, 1.4906],
        [ 1.0461, 0.7557, 0.0187],
        [ 2.2446, 0.3473, 1.0873]])

add result=  tensor([[ 0.1000, 0.1325, 0.0461],
        [ 0.4731, 0.4523, 0.7517],
        [ 0.2995, 0.9576, 1.4906],
        [ 1.0461, 0.7557, 0.0187],
        [ 2.2446, 0.3473, 1.0873]])

tensor3=  tensor([[ 0.1000, 0.1325, 0.0461],
        [ 0.4731, 0.4523, 0.7517],
        [ 0.2995, 0.9576, 1.4906],
        [ 1.0461, 0.7557, 0.0187],
        [ 2.2446, 0.3473, 1.0873]])

Note that operations that modify a tensor variable in place all have the suffix _. For example, x.copy_(y) and x.t_() both change x itself, as sketched below.
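A minimal sketch of both in-place operations:

x = torch.ones(2, 3)
y = torch.zeros(2, 3)
x.copy_(y)   # copies the values of y into x in place
print(x)
x.t_()       # transposes x in place; its size becomes (3, 2)
print(x.size())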

Besides addition, Tensors support NumPy-style indexing, so you can use indices to access a slice of the data, as follows:

# Access the first column of tensor3
print(tensor3[:, 0])

Output result:

tensor([0.1000, 0.4731, 0.2995, 1.0461, 2.2446])

To change the shape of a Tensor, you can use tensor.view(), as follows:

x = torch.randn(4, 4)
y = x.view(16)
# -1 means this dimension is inferred from the other dimensions
z = x.view(-1, 8)
print(x.size(), y.size(), z.size())

Output result:

torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])

If the tensor contains only one element, you can use .item() to get the corresponding Python number:

x = torch.randn(1)
print(x)
print(x.item())

Output result:

tensor([0.4549])
0.4549027979373932

More operations can be found in the official documentation:

Pytorch.org/docs/stable…

1.3 Converting Between Tensors and NumPy Arrays

A Tensor and a NumPy array can be converted to each other. On the CPU they share the underlying memory, so changing the value of one also changes the value of the other.

1.3.1 Converting Tensors to NumPy Arrays

The example below shows how to convert a Tensor to a NumPy array, which is done by calling tensor.numpy():

a = torch.ones(5)
print(a)
b = a.numpy()
print(b)

Output result:

tensor([1., 1., 1., 1., 1.])
[1. 1. 1. 1. 1.]

Since, as just mentioned, they share the same memory, here is an example: modify the tensor variable a and see whether the NumPy array b obtained from it changes as well.

a.add_(1)
print(a)
print(b)

The output is as follows; clearly, b changes along with a.

tensor([2., 2., 2., 2., 2.])
[2. 2. 2. 2. 2.]
1.3.2 Converting NumPy Arrays to Tensors

The conversion is done by calling torch.from_numpy(numpy_array). Here is an example:

import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)

Output result:

[2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)

On the CPU, all Tensor types except CharTensor support conversion to and from NumPy arrays.
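As a quick check (a minimal sketch), the dtype is carried over in the conversion:

# The dtype is preserved when converting a tensor to a NumPy array
t = torch.ones(3, dtype=torch.int32)
print(t.numpy().dtype)  # int32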

1.4 CUDA Tensors

Tensors can be moved between devices, i.e. the CPU and the GPU, with the .to() method. Here is an example:

# When CUDA is available, use the torch.device() method to run the computation on the GPU
if torch.cuda.is_available():
    device = torch.device("cuda")          # define a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on the GPU
    x = x.to(device)                       # or simply use .to("cuda")
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # .to() can also change the dtype

The first result is computed on the GPU, so the printed variable carries device='cuda:0'; the second is the variable moved back to the CPU.

tensor([1.4549], device='cuda:0')

tensor([1.4549], dtype=torch.float64)

The tutorial for this section:

Pytorch.org/tutorials/b…

The code for this section:

Github.com/ccc013/Deep…

2. Autograd

For building neural networks in PyTorch, a key package is autograd, which provides automatic differentiation for all operations on Tensors, that is, it computes gradients. It is a define-by-run framework: backpropagation is defined by how the code runs, so every iteration can be different.
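To make define-by-run concrete, here is a minimal sketch (not from the original tutorial): the graph is rebuilt on every run, so ordinary Python control flow can change it from one iteration to the next.

import torch

x = torch.ones(3, requires_grad=True)
# The branch taken at runtime decides which graph gets built
if x.sum() > 2:
    y = (x * 2).sum()
else:
    y = (x * 3).sum()
y.backward()
print(x.grad)  # tensor([2., 2., 2.]), since the first branch was taken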

Here are some brief examples to illustrate the library’s usefulness.

2.1 Tensor

torch.Tensor is the central class of the package. When you set its attribute requires_grad=True, it starts to track all operations on that variable. When the computation is finished, you can call .backward() and all gradients are computed automatically and stored in the .grad attribute.

Calling .detach() detaches a tensor from its computation history; it stops the variable from tracking its history and prevents future computation from being tracked, as in the sketch below.
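A minimal sketch of .detach():

import torch

x = torch.ones(2, 2, requires_grad=True)
y = x.detach()
print(y.requires_grad)  # False: y shares data with x but tracks no history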

If you want to prevent history tracking (and memory usage), you can wrap the code block in with torch.no_grad():. This is especially useful when evaluating a model whose trained parameters have requires_grad=True but whose gradients are not actually needed, as sketched below.
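For instance, a hypothetical evaluation sketch (nn.Linear is used only as a stand-in for a trained model; it is not part of this section):

import torch
import torch.nn as nn

model = nn.Linear(3, 1)        # stand-in for a trained model
inputs = torch.rand(4, 3)

with torch.no_grad():
    outputs = model(inputs)    # no history is tracked inside this block
print(outputs.requires_grad)   # False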

There is another class that is also very important for the autograd implementation: Function.

Tensor and Function are interconnected and together build an acyclic graph that encodes the complete history of computation. Every tensor variable has a .grad_fn attribute that references the Function that created it (except for tensors created by the user, whose grad_fn is None).

If you want to compute derivatives, you can call .backward() on a Tensor variable. If the variable is a scalar, i.e. it contains a single element, .backward() needs no arguments; if it has multiple elements, you must pass a gradient argument, a tensor of matching shape. See Section 2.2 on gradients.

Let’s go further with the code.

First import the necessary libraries:

import torch

Start by creating a tensor with requires_grad=True to track computation on it:

x = torch.ones(2, 2, requires_grad=True)
print(x)

Output result:

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

Then perform an operation on it, here a simple addition:

y = x + 2
print(y)

Output result:

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward>)

y is the result of an operation, so it has the grad_fn attribute:

print(y.grad_fn)

Output result:

<AddBackward object at 0x00000216D25DCC88>

Continue to operate on variable y:

z = y * y * 3
out = z.mean()

print('z=', z)
print('out=', out)

Output result:

z= tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward>)

out= tensor(27., grad_fn=<MeanBackward1>)

In fact, a tensor's requires_grad flag defaults to False. You can specify requires_grad=True when defining the variable, or set it afterwards by calling .requires_grad_(True). As with the add_() operation explained in the previous section, the _ suffix means the attribute of the variable itself is changed in place. Here is a code example:

a = torch.randn(2, 2)
a = ((a * 3) / (a - 1))
print(a.requires_grad)
a.requires_grad_(True)
print(a.requires_grad)
b = (a * a).sum()
print(b.grad_fn)

The first printed line is the initial value of requires_grad, which is False; after calling .requires_grad_(True), it prints True:

False

True

<SumBackward0 object at 0x00000216D25ED710>

2.2 Gradients

Next we compute gradients with a backpropagation pass. The variable out defined in the previous section is a scalar, so out.backward() is equivalent to out.backward(torch.tensor(1.)):

out.backward()
# Print the gradient d(out)/dx
print(x.grad)

Output result:

tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])

The result is a matrix filled with 4.5. Denoting the out variable as $o$, from the previous definitions we have

$$o = \frac{1}{4}\sum_i z_i, \qquad z_i = 3(x_i + 2)^2, \qquad z_i\big|_{x_i=1} = 27$$

In detail: x is initially a matrix of all ones, adding 2 gives y = x + 2, then z = y * y * 3, and since z is a 2x2 matrix, taking the mean to get out divides by 4, which yields the three formulas above.

Therefore, the gradient is

$$\frac{\partial o}{\partial x_i} = \frac{3}{2}(x_i + 2), \qquad \frac{\partial o}{\partial x_i}\bigg|_{x_i=1} = \frac{9}{2} = 4.5$$
Mathematically, if you have a vector-valued function

$$\vec{y} = f(\vec{x})$$

then the gradient of $\vec{y}$ with respect to $\vec{x}$ is a Jacobian matrix:

$$J = \begin{pmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n} \end{pmatrix}$$
In general terms, torch.autograd is an engine for computing vector-Jacobian products. We skip the rest of the mathematical detail and go straight to a code example:

x = torch.randn(3, requires_grad=True)

y = x * 2
while y.data.norm() < 1000:
    y = y * 2

print(y)

Output result:

tensor([ 237.5009, 1774.2396,  274.0625], grad_fn=<MulBackward>)

The variable y obtained here is no longer a scalar. torch.autograd cannot compute the full Jacobian directly, but we can obtain the vector-Jacobian product simply by passing the vector to backward() as an argument, as in the following example:

v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(v)

print(x.grad)

Output result:

tensor([ 102.4000, 1024.0000,    0.1024])

Finally, wrapping code in with torch.no_grad() stops autograd from tracking the history of variables that have requires_grad=True:

print(x.requires_grad)
print((x ** 2).requires_grad)

with torch.no_grad():
    print((x ** 2).requires_grad)

Output result:

True

True

False

More on Autograd and Function:

Pytorch.org/docs/autogr…

The tutorial for this section:

Pytorch.org/tutorials/b…

The code for this section:

Github.com/ccc013/Deep…


Summary

This first article briefly introduced PyTorch, a replacement for NumPy and a newer deep learning framework that has grown rapidly in just two or three years. Partly owing to TensorFlow's shortcomings, more and more people are choosing PyTorch; for academic researchers in particular, it is often faster to work with.

It also introduced the most important and fundamental concept, tensors, whose methods and operations closely resemble those of NumPy arrays. The two can be converted to each other, the slight difference being that tensors can run on a GPU to accelerate computation.

Finally, I briefly introduced autograd, a package that is essential for deep learning because it computes gradients automatically.

Welcome to follow my WeChat official account, Machine Learning and Computer Vision, or scan the QR code below, so we can communicate, learn, and make progress together!

Previous recommended articles

Machine learning series
  • A hands-on machine learning tutorial for beginners!
  • Model evaluation, overfitting, underfitting, and hyperparameter tuning methods
  • Summary and Comparison of Commonly used Machine Learning Algorithms
  • Summary and Comparison of Common Machine Learning Algorithms (PART 1)
  • How to Build a Complete Machine Learning Project
  • Data Preprocessing for Feature Engineering (Part 1)
  • Learn about eight applications of computer vision
Github projects & Resource tutorials recommended
  • [Github Project recommends] a better site for reading and finding papers
  • TensorFlow is now available in Chinese
  • Must-read AI and Deep learning blog
  • An easy-to-understand TensorFlow tutorial
  • Recommend some Python books and tutorials, both beginner and advanced!
  • [Github project recommendation] Machine learning & Python
  • [Github Project Recommendations] Here are three tools to help you get the most out of Github
  • Github provides information about universities and foreign open course videos
  • Did you pronounce all these words correctly? Plus three English tutorials just for programmers!