In this article, I share code for visualizing feature maps.


A few words up front

Feature map visualization is something many papers need to do. It can serve to demonstrate that a method works, or, frankly, to pad a paper's workload and word count.

Specifically, you visualize two feature maps, one produced with the new method and one produced without it, compare them to see what has changed, and then describe the pictures in the paper to illustrate what the new method does.

Joking aside, sometimes even the author of such a figure cannot explain it: the visualization really does change, but what the change means is unclear. The author simply spins a story that fits the new method, much like the first-grade composition exercise of "look at the picture and write a story."

A while back there was a hot question on Zhihu: if I make only a small improvement on a baseline but get a large gain, can I write a paper about it?

In that case, the biggest problem is how to fill more than seven pages. The small improvement itself, with its idea, formula derivation, and figures, may take less than one page. What fills the rest? Feature map visualization!

I have seen this in many papers: I could not make sense of the visualization section, yet the author managed to write at length about it. Presumably it is there to increase the word count and the apparent workload.

In short, feature map visualization is important work, and it is well worth knowing how to do.

Initial Configuration

This part covers data loading, network modification, network definition, and loading of the pretrained model.

Load data and preprocess it

Since we load only a single image, we do not go through a Dataset class; a Dataset is meant for large amounts of data and produces an iterator that feeds batches of images to the network. We do, however, still need the same preprocessing a Dataset would perform.

The preprocessing here only needs to resize the image, convert it to Tensor format, and normalize it. Add other augmentation or preprocessing operations as needed.

from PIL import Image
import torch
from torchvision import transforms

def image_proprecess(img_path):
    img = Image.open(img_path)
    data_transforms = transforms.Compose([
        transforms.Resize((384, 384), interpolation=3),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
    data = data_transforms(img)
    data = torch.unsqueeze(data, 0)
    return data

Since only one image is loaded here, torch.unsqueeze is then used to turn the 3-dimensional tensor into a 4-dimensional one by adding a batch dimension.
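
As a quick sanity check, here is a minimal sketch of the shapes involved (the image path is a placeholder):

data = image_proprecess("example.jpg")  # placeholder path
print(data.shape)  # torch.Size([1, 3, 384, 384]): batch, channels, height, width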

Modify the network

If you want to visualize the feature map of a certain layer, you need to return that layer's feature map, so you first modify the forward function of the network. The modification looks like this:

def forward(self, x):
    x = self.model.conv1(x)
    x = self.model.bn1(x)
    x = self.model.relu(x)
    x = self.model.maxpool(x)
    feature = self.model.layer1(x)   # keep the layer1 output for visualization
    x = self.model.layer2(feature)
    x = self.model.layer3(x)
    x = self.model.layer4(x)
    return feature, x                # return the feature map along with the final output
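As an alternative, if you would rather not modify the network, PyTorch forward hooks can capture the same feature map. A minimal sketch on a plain torchvision ResNet-50 (the model and layer names here are illustrative, not the network used in this article; random weights are fine for a shape check):

import torch
import torchvision

model = torchvision.models.resnet50().eval()
features = {}

def save_feature(module, inputs, output):
    # stash the layer output whenever forward() runs
    features['layer1'] = output

model.layer1.register_forward_hook(save_feature)
with torch.no_grad():
    _ = model(torch.randn(1, 3, 384, 384))
print(features['layer1'].shape)  # torch.Size([1, 256, 96, 96])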

Define the network and load the pretrained model

import os

def Init_Setting(epoch):
    dirname = '/mnt/share/VideoReID/share/models/Methods5_trial1'
    model = siamese_resnet50(701, stride=1, pool='avg')
    trained_path = os.path.join(dirname, 'net_%03d.pth' % epoch)
    print("load %03d.pth" % epoch)
    model.load_state_dict(torch.load(trained_path))
    model = model.cuda().eval()   # move to GPU and switch to inference mode
    return model

The only thing to note in this section is the call to eval(), which puts the network into inference mode.
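
Beyond eval(), which fixes BatchNorm statistics and disables Dropout, inference is usually also wrapped in torch.no_grad() to skip gradient bookkeeping. A minimal sketch using the model and data from this article:

with torch.no_grad():
    feature, out = model(data)  # no autograd graph is built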

Visualizing the feature map

This part converts individual channels of a feature map into images for visualization.

import numpy as np
import matplotlib.pyplot as plt

def visualize_feature_map(img_batch, out_path, type, BI):
    feature_map = torch.squeeze(img_batch)              # drop the batch dimension
    feature_map = feature_map.detach().cpu().numpy()    # GPU tensor -> numpy array

    feature_map_sum = feature_map[0, :, :]
    feature_map_sum = np.expand_dims(feature_map_sum, axis=2)
    for i in range(0, 2048):
        feature_map_split = feature_map[i, :, :]        # one channel as a 2-D map
        feature_map_split = np.expand_dims(feature_map_split, axis=2)
        if i > 0:
            feature_map_sum += feature_map_split        # element-wise sum over channels
        feature_map_split = BI.transform(feature_map_split)

        plt.imshow(feature_map_split)
        plt.savefig(out_path + str(i) + "_{}.jpg".format(type))
        plt.xticks()
        plt.yticks()
        plt.axis('off')

    feature_map_sum = BI.transform(feature_map_sum)
    plt.imshow(feature_map_sum)
    plt.savefig(out_path + "sum_{}.jpg".format(type))
    print("save sum_{}.jpg".format(type))

Here is a line-by-line explanation.

1. The parameter img_batch is a feature map returned from some layer of the network, and BI is a custom bilinear interpolation function, discussed below.

2. Since only one image is visualized, img_batch is 4-dimensional with a batch dimension of 1. The line feature_map.detach().cpu().numpy() moves it from the GPU to the CPU and converts it to numpy format.

3. The rest converts each channel into an image, accumulates the element-wise sum across all channels, and saves everything.
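
As an aside, the channel accumulation performed in the loop can also be written as a single numpy call; a minimal sketch (the mean gives the same spatial pattern as the sum, just on a smaller scale):

feature_map_sum = feature_map.sum(axis=0)    # H x W, summed over channels
feature_map_mean = feature_map.mean(axis=0)  # same pattern, smaller value range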

Bilinear interpolation

After repeated downsampling in the network, feature maps in later layers are often only 7×7 or 16×16 in size, too small to view directly, so they need to be upsampled, in this case by bilinear interpolation. Below is an implementation of bilinear interpolation.

import logging
import numpy as np
from math import ceil

class BilinearInterpolation(object):
    def __init__(self, w_rate: float, h_rate: float, *, align='center'):
        if align not in ['center', 'left']:
            logging.exception(f'{align} is not a valid align parameter')
            align = 'center'
        self.align = align
        self.w_rate = w_rate
        self.h_rate = h_rate

    def set_rate(self, w_rate: float, h_rate: float):
        self.w_rate = w_rate    # scaling rate along the width
        self.h_rate = h_rate    # scaling rate along the height

    # map a pixel coordinate in the output image back to the source image (height)
    def get_src_h(self, dst_i, source_h, goal_h) -> float:
        if self.align == 'left':
            # align the top-left corners
            src_i = float(dst_i * (source_h/goal_h))
        elif self.align == 'center':
            # align the geometric centers of the two images
            src_i = float((dst_i + 0.5) * (source_h/goal_h) - 0.5)
        src_i += 0.001
        src_i = max(0.0, src_i)
        src_i = min(float(source_h - 1), src_i)
        return src_i

    # map a pixel coordinate in the output image back to the source image (width)
    def get_src_w(self, dst_j, source_w, goal_w) -> float:
        if self.align == 'left':
            # align the top-left corners
            src_j = float(dst_j * (source_w/goal_w))
        elif self.align == 'center':
            # align the geometric centers of the two images
            src_j = float((dst_j + 0.5) * (source_w/goal_w) - 0.5)
        src_j += 0.001
        src_j = max(0.0, src_j)
        src_j = min((source_w - 1), src_j)
        return src_j

    def transform(self, img):
        source_h, source_w, source_c = img.shape  # e.g. (235, 234, 3)
        goal_h, goal_w = round(
            source_h * self.h_rate), round(source_w * self.w_rate)
        new_img = np.zeros((goal_h, goal_w, source_c), dtype=np.uint8)

        for i in range(new_img.shape[0]):       # h
            src_i = self.get_src_h(i, source_h, goal_h)
            for j in range(new_img.shape[1]):
                src_j = self.get_src_w(j, source_w, goal_w)
                i2 = ceil(src_i)
                i1 = int(src_i)
                j2 = ceil(src_j)
                j1 = int(src_j)
                x2_x = j2 - src_j
                x_x1 = src_j - j1
                y2_y = i2 - src_i
                y_y1 = src_i - i1
                new_img[i, j] = img[i1, j1]*x2_x*y2_y + img[i1, j2] * \
                    x_x1*y2_y + img[i2, j1]*x2_x*y_y1 + img[i2, j2]*x_x1*y_y1
        return new_img

# usage
BI = BilinearInterpolation(8, 8)
feature_map = BI.transform(feature_map)
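For reference, the same upsampling can also be done with PyTorch's built-in bilinear interpolation, applied to the 4-D tensor before the squeeze. A minimal sketch (variable names illustrative; scale_factor=8 matches the BilinearInterpolation(8, 8) setting above):

import torch.nn.functional as F

upsampled = F.interpolate(img_batch, scale_factor=8,
                          mode='bilinear', align_corners=False)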

Main function flow

The code for each part is described above; below is the overall flow, which is straightforward.

imgs_path = "/path/to/imgs/"
save_path = "/save/path/to/output/"
model = Init_Setting(120)
BI = BilinearInterpolation(8, 8)

data = image_proprecess(imgs_path + "0836.jpg")
data = data.cuda()
output, _ = model(data)
visualize_feature_map(output, save_path, "drone", BI)

Visualization results

(The resulting feature map images are omitted here.)

