In this article, I share code for visualizing feature maps.


A few words up front

Feature map visualization is something many papers need to do. It can serve to demonstrate that a method works, or, frankly, to pad a paper's workload and word count.

Specifically, you visualize two feature maps, one produced with the new method and one produced without it, compare them to see what has changed, and then describe the pictures in the paper to illustrate what the new method does.

Joking aside, sometimes even the author of such a figure cannot explain it: the visualization really does change, but what the change means is unclear. The author simply spins a story that fits the new method, much like the first-grade composition exercise of "look at the picture and write a story."

A while back there was a hot question on Zhihu: if I make only a small improvement on a baseline but get a large gain, can I write a paper about it?

In that case, the biggest problem is how to fill more than seven pages. The small improvement itself, with its idea, formula derivation, and figures, may take less than one page. What fills the rest? Feature map visualization!

I have seen this in many papers: I could not make sense of the visualization section, yet the author managed to write at length about it. Presumably it is there to increase the word count and the apparent workload.

In short, feature map visualization is important work, and it is well worth knowing how to do.

Initial Configuration

This part covers data loading, network modification, network definition, and loading of the pretrained model.

Load data and preprocess it

Since we load only a single image, we do not go through a Dataset class; a Dataset is meant for large amounts of data and produces an iterator that feeds batches of images to the network. We do, however, still need the same preprocessing a Dataset would perform.

The preprocessing here only needs to resize the image, convert it to Tensor format, and normalize it. Add other augmentation or preprocessing operations as needed.

from PIL import Image
import torch
from torchvision import transforms

def image_proprecess(img_path):
    img = Image.open(img_path)
    data_transforms = transforms.Compose([
        transforms.Resize((384, 384), interpolation=3),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
    data = data_transforms(img)
    data = torch.unsqueeze(data, 0)
    return data

Since only one image is loaded here, torch.unsqueeze is then used to turn the 3-dimensional tensor into a 4-dimensional one by adding a batch dimension.
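
As a quick sanity check, here is a minimal sketch of the shapes involved (the image path is a placeholder):

data = image_proprecess("example.jpg")  # placeholder path
print(data.shape)  # torch.Size([1, 3, 384, 384]): batch, channels, height, width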

Modify the network

If you want to visualize the feature map of a certain layer, you need to return that layer's feature map, so you first modify the forward function of the network. The modification looks like this:

def forward(self, x):
    x = self.model.conv1(x)
    x = self.model.bn1(x)
    x = self.model.relu(x)
    x = self.model.maxpool(x)
    feature = self.model.layer1(x)   # keep the layer1 output for visualization
    x = self.model.layer2(feature)
    x = self.model.layer3(x)
    x = self.model.layer4(x)
    return feature, x                # return the feature map along with the final output
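As an alternative, if you would rather not modify the network, PyTorch forward hooks can capture the same feature map. A minimal sketch on a plain torchvision ResNet-50 (the model and layer names here are illustrative, not the network used in this article; random weights are fine for a shape check):

import torch
import torchvision

model = torchvision.models.resnet50().eval()
features = {}

def save_feature(module, inputs, output):
    # stash the layer output whenever forward() runs
    features['layer1'] = output

model.layer1.register_forward_hook(save_feature)
with torch.no_grad():
    _ = model(torch.randn(1, 3, 384, 384))
print(features['layer1'].shape)  # torch.Size([1, 256, 96, 96])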

Define the network and load the pretrained model

import os

def Init_Setting(epoch):
    dirname = '/mnt/share/VideoReID/share/models/Methods5_trial1'
    model = siamese_resnet50(701, stride=1, pool='avg')
    trained_path = os.path.join(dirname, 'net_%03d.pth' % epoch)
    print("load %03d.pth" % epoch)
    model.load_state_dict(torch.load(trained_path))
    model = model.cuda().eval()   # move to GPU and switch to inference mode
    return model

The only thing to note in this section is the call to eval(), which puts the network into inference mode.
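
Beyond eval(), which fixes BatchNorm statistics and disables Dropout, inference is usually also wrapped in torch.no_grad() to skip gradient bookkeeping. A minimal sketch using the model and data from this article:

with torch.no_grad():
    feature, out = model(data)  # no autograd graph is built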

Visualizing the feature map

This part converts individual channels of a feature map into images for visualization.

import numpy as np
import matplotlib.pyplot as plt

def visualize_feature_map(img_batch, out_path, type, BI):
    feature_map = torch.squeeze(img_batch)              # drop the batch dimension
    feature_map = feature_map.detach().cpu().numpy()    # GPU tensor -> numpy array

    feature_map_sum = feature_map[0, :, :]
    feature_map_sum = np.expand_dims(feature_map_sum, axis=2)
    for i in range(0, 2048):
        feature_map_split = feature_map[i, :, :]        # one channel as a 2-D map
        feature_map_split = np.expand_dims(feature_map_split, axis=2)
        if i > 0:
            feature_map_sum += feature_map_split        # element-wise sum over channels
        feature_map_split = BI.transform(feature_map_split)

        plt.imshow(feature_map_split)
        plt.savefig(out_path + str(i) + "_{}.jpg".format(type))
        plt.xticks()
        plt.yticks()
        plt.axis('off')

    feature_map_sum = BI.transform(feature_map_sum)
    plt.imshow(feature_map_sum)
    plt.savefig(out_path + "sum_{}.jpg".format(type))
    print("save sum_{}.jpg".format(type))

Here is a line-by-line explanation.

1. The parameter img_batch is a feature map returned from some layer of the network, and BI is a custom bilinear interpolation function, discussed below.

2. Since only one image is visualized, img_batch is 4-dimensional with a batch dimension of 1. The line feature_map.detach().cpu().numpy() moves it from the GPU to the CPU and converts it to numpy format.

3. The rest converts each channel into an image, accumulates the element-wise sum across all channels, and saves everything.
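
As an aside, the channel accumulation performed in the loop can also be written as a single numpy call; a minimal sketch (the mean gives the same spatial pattern as the sum, just on a smaller scale):

feature_map_sum = feature_map.sum(axis=0)    # H x W, summed over channels
feature_map_mean = feature_map.mean(axis=0)  # same pattern, smaller value range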

Bilinear interpolation

After repeated downsampling in the network, feature maps in later layers are often only 7×7 or 16×16 in size, too small to view directly, so they need to be upsampled, in this case by bilinear interpolation. Below is an implementation of bilinear interpolation.

import logging
import numpy as np
from math import ceil

class BilinearInterpolation(object):
    def __init__(self, w_rate: float, h_rate: float, *, align='center'):
        if align not in ['center', 'left']:
            logging.exception(f'{align} is not a valid align parameter')
            align = 'center'
        self.align = align
        self.w_rate = w_rate
        self.h_rate = h_rate

    def set_rate(self, w_rate: float, h_rate: float):
        self.w_rate = w_rate    # scaling rate along the width
        self.h_rate = h_rate    # scaling rate along the height

    # map a pixel coordinate in the output image back to the source image (height)
    def get_src_h(self, dst_i, source_h, goal_h) -> float:
        if self.align == 'left':
            # align the top-left corners
            src_i = float(dst_i * (source_h/goal_h))
        elif self.align == 'center':
            # align the geometric centers of the two images
            src_i = float((dst_i + 0.5) * (source_h/goal_h) - 0.5)
        src_i += 0.001
        src_i = max(0.0, src_i)
        src_i = min(float(source_h - 1), src_i)
        return src_i

    # map a pixel coordinate in the output image back to the source image (width)
    def get_src_w(self, dst_j, source_w, goal_w) -> float:
        if self.align == 'left':
            # align the top-left corners
            src_j = float(dst_j * (source_w/goal_w))
        elif self.align == 'center':
            # align the geometric centers of the two images
            src_j = float((dst_j + 0.5) * (source_w/goal_w) - 0.5)
        src_j += 0.001
        src_j = max(0.0, src_j)
        src_j = min((source_w - 1), src_j)
        return src_j

    def transform(self, img):
        source_h, source_w, source_c = img.shape  # e.g. (235, 234, 3)
        goal_h, goal_w = round(
            source_h * self.h_rate), round(source_w * self.w_rate)
        new_img = np.zeros((goal_h, goal_w, source_c), dtype=np.uint8)

        for i in range(new_img.shape[0]):       # h
            src_i = self.get_src_h(i, source_h, goal_h)
            for j in range(new_img.shape[1]):
                src_j = self.get_src_w(j, source_w, goal_w)
                i2 = ceil(src_i)
                i1 = int(src_i)
                j2 = ceil(src_j)
                j1 = int(src_j)
                x2_x = j2 - src_j
                x_x1 = src_j - j1
                y2_y = i2 - src_i
                y_y1 = src_i - i1
                new_img[i, j] = img[i1, j1]*x2_x*y2_y + img[i1, j2] * \
                    x_x1*y2_y + img[i2, j1]*x2_x*y_y1 + img[i2, j2]*x_x1*y_y1
        return new_img

# usage
BI = BilinearInterpolation(8, 8)
feature_map = BI.transform(feature_map)
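For reference, the same upsampling can also be done with PyTorch's built-in bilinear interpolation, applied to the 4-D tensor before the squeeze. A minimal sketch (variable names illustrative; scale_factor=8 matches the BilinearInterpolation(8, 8) setting above):

import torch.nn.functional as F

upsampled = F.interpolate(img_batch, scale_factor=8,
                          mode='bilinear', align_corners=False)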

Main function flow

The code for each part is described above; below is the overall flow, which is straightforward.

imgs_path = "/path/to/imgs/"
save_path = "/save/path/to/output/"
model = Init_Setting(120)
BI = BilinearInterpolation(8, 8)

data = image_proprecess(imgs_path + "0836.jpg")
data = data.cuda()
output, _ = model(data)
visualize_feature_map(output, save_path, "drone", BI)

Visualization results

(The resulting feature map images are omitted here.)

