Tall and handsome little M came to the door of the second stage. Jubilant as he was, little M did not know what kind of encounter awaited him here.

The code is in: Pytorch-tutorial/vgg_model.py

The backbone offers three VGG options for feature extraction: VGG-13, VGG-16, and VGG-19. Little M selects the middle one, VGG-16.

A line of text then appears, stating that this stage is built from VGG-16 with the final pooling layer and the subsequent fully connected layers removed. Its structure is as follows:

        print(images.image_list.shape)                    # output torch.Size([1, 3, 608, 1088])
        # Feed the image into the backbone to get the feature map
        feature_maps = self.backbone(images.image_list)
        print(feature_maps.shape)                         # output torch.Size([1, 512, 38, 68])

Looking at his tall and handsome 3-D self, little M calculates what he will look like after passing through the backbone.

In VGG-16, every convolutional layer is configured as:

kernel_size = 3
stride = 1
padding = 1

This means the width and height stay the same through each convolution: $\frac{W - 3 + 2}{1} + 1 = W$
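The formula above is the standard convolution output size, $\frac{W - k + 2p}{s} + 1$, with $k=3$, $s=1$, $p=1$. A quick arithmetic check (the helper `conv_out` is just for illustration, not part of the tutorial's code):

```python
def conv_out(w, k=3, s=1, p=1):
    """Output width/height of a conv layer: (w - k + 2p) // s + 1."""
    return (w - k + 2 * p) // s + 1

# With kernel_size=3, stride=1, padding=1 the size is unchanged:
print(conv_out(608), conv_out(1088))   # 608 1088
```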

Each pooling layer is configured as:

kernel_size = 2
stride = 2
padding = 0

This means each pooling step halves the width and height.

After the four remaining MaxPool layers: $\frac{608}{2^4} = 38,\quad \frac{1088}{2^4} = 68$
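The same calculation can be stepped through pool by pool; `pool_out` below is an illustrative helper using the same output-size formula with $k=2$, $s=2$, $p=0$:

```python
def pool_out(w, k=2, s=2, p=0):
    """Output width/height of a pooling layer: (w - k + 2p) // s + 1."""
    return (w - k + 2 * p) // s + 1

w, h = 608, 1088
for _ in range(4):               # the 4 pooling layers that remain
    w, h = pool_out(w), pool_out(h)
print(w, h)                      # 38 68
```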

After finishing the calculation, little M learned that he would come out small and short, with 512 channels. He hesitated a little, but still walked in.

After a while, little M walked out. The next stage is the legendarily difficult RPN.

Where will the 512 × 38 × 68 little M go?