Visualizing the network model

Caffe currently has two commonly used ways to visualize a network model:

  • Online visualization using Netscope
  • The draw_net.py script built into the Caffe code package

Netscope

Netscope can visualize neural network architectures (technically, Netscope can visualize any directed acyclic graph), and it currently supports Caffe prototxt files. Address: ethereon.github.io/netscope/#/… Using Netscope is very simple: just paste the prototxt text into the Netscope edit box and press Shift+Enter to get a visualization of the network model. Its advantage is that the rendered model is clean, and hovering the mouse over any module of the visualized model on the right displays that module's parameters. Figure 1 takes the train.prototxt file of the ZF model in Faster R-CNN as an example.

Figure 1 ZF network model visualized by Netscope
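If you just want to try the editor, any valid prototxt will do. Below is a minimal hand-written sketch (the layer names and sizes are arbitrary, not taken from any particular model) that Netscope renders as a small four-node graph:

name: "TinyNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 224 dim: 224 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param { num_output: 96 kernel_size: 7 stride: 2 }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}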

draw_net.py

draw_net.py can also draw a prototxt file as a network model diagram. Before drawing, two dependency libraries need to be installed:

# Install GraphViz
sudo apt-get install graphviz
# Install pydot
sudo pip install pydot

After the installation is complete, you can call draw_net.py to draw the network model, for example, to draw caffe’s built-in LeNet network model:

sudo python python/draw_net.py examples/mnist/lenet_train_test.prototxt netImage/lenet.png --rankdir=TB

There are three parameters, with their meanings as follows:

  • First parameter: the prototxt file of the network model
  • Second parameter: the path and name of the saved image
  • Third parameter: --rankdir=x, where x has four options: LR, RL, TB, BT, indicating the direction in which the network is drawn: left to right, right to left, top to bottom, or bottom to top. The default is LR.

The visualization results are as follows:

Figure 2 LeNet network model visualized by draw_net.py
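The layout direction can be changed through --rankdir; for instance, the same LeNet model can be drawn left to right (the default) instead of top to bottom. The output filename here is arbitrary:

sudo python python/draw_net.py examples/mnist/lenet_train_test.prototxt netImage/lenet_LR.png --rankdir=LR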

Visualizing image features

I have used two ways of visualizing image features:

  • Modify the demo.py code to output intermediate-layer results
  • Use the deep-visualization-toolbox visualization tool

Modifying demo.py

This part follows the layer-by-layer feature visualization in Xue Kaiyu's Caffe learning notes [1], taking a ZF network trained on Pascal VOC as an example. After modifying the demo.py file, the code is as follows:

#!/usr/bin/env python
#-*-coding:utf-8-*-


import matplotlib
matplotlib.use('Agg')
import _init_paths
from fast_rcnn.config import cfg
from fast_rcnn.test import im_detect
from fast_rcnn.nms_wrapper import nms
from utils.timer import Timer
import matplotlib.pyplot as plt
import numpy as np
import scipy.io as sio
import caffe, os, sys, cv2
import argparse

CLASSES = ('__background__',
           'aeroplane', 'bicycle', 'bird', 'boat',
           'bottle', 'bus', 'car', 'cat', 'chair',
           'cow', 'diningtable', 'dog', 'horse',
           'motorbike', 'person', 'pottedplant',
           'sheep', 'sofa', 'train', 'tvmonitor')

NETS = {'vgg16': ('VGG16',
                  'VGG16_faster_rcnn_final.caffemodel'),
        'zf': ('ZF',
                  'zf_faster_rcnn_iter_2000.caffemodel')}

def vis_detections(im, class_name, dets, thresh=0.5):
    """Draw detected bounding boxes."""
    inds = np.where(dets[:, -1] >= thresh)[0]
    if len(inds) == 0:
        return

    im = im[:, :, (2, 1, 0)]
    fig, ax = plt.subplots(figsize=(12, 12))
    ax.imshow(im, aspect='equal')
    for i in inds:
        bbox = dets[i, :4]
        score = dets[i, -1]

        ax.add_patch(
            plt.Rectangle((bbox[0], bbox[1]),
                          bbox[2] - bbox[0],
                          bbox[3] - bbox[1], fill=False,
                          edgecolor='red', linewidth=3.5)
            )
        ax.text(bbox[0], bbox[1] - 2,
                '{:s} {:.3f}'.format(class_name, score),
                bbox=dict(facecolor='blue', alpha=0.5),
                fontsize=14, color='white')

    ax.set_title(('{} detections with '
                  'p({} | box) >= {:.1f}').format(class_name, class_name,
                                                  thresh),
                  fontsize=14)
    plt.axis('off')
    plt.tight_layout()
    plt.draw()

def demo(net, image_name):
    """Detect object classes in an image using pre-computed object proposals."""

    # Load the demo image
    im_file = os.path.join(cfg.DATA_DIR, 'demo', image_name)
    im = cv2.imread(im_file)

    # Detect all object classes and regress object bounds
    timer = Timer()
    timer.tic()
    scores, boxes = im_detect(net, im)
    timer.toc()
    print ('Detection took {:.3f}s for '
           '{:d} object proposals').format(timer.total_time, boxes.shape[0])

    # Visualize detections for each class
    CONF_THRESH = 0.8
    NMS_THRESH = 0.3
    for cls_ind, cls in enumerate(CLASSES[1:]):
        cls_ind += 1 # because we skipped background
        cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
        cls_scores = scores[:, cls_ind]
        dets = np.hstack((cls_boxes,
                          cls_scores[:, np.newaxis])).astype(np.float32)
        keep = nms(dets, NMS_THRESH)
        dets = dets[keep, :]
        vis_detections(im, cls, dets, thresh=CONF_THRESH)

def parse_args():
    """Parse input arguments."""
    parser = argparse.ArgumentParser(description='Faster R-CNN demo')
    parser.add_argument('--gpu', dest='gpu_id', help='GPU device id to use [0]',
                        default=0, type=int)
    parser.add_argument('--cpu', dest='cpu_mode',
                        help='Use CPU mode (overrides --gpu)',
                        action='store_true')
    parser.add_argument('--net', dest='demo_net', help='Network to use [zf]',
                        choices=NETS.keys(), default='zf')

    args = parser.parse_args()

    return args





if __name__ == '__main__':

    cfg.TEST.HAS_RPN = True  # Use RPN for proposals

    args = parse_args()
    prototxt = os.path.join(cfg.MODELS_DIR, NETS[args.demo_net][0],
                            'faster_rcnn_alt_opt', 'faster_rcnn_test.pt')
    caffemodel = os.path.join(cfg.DATA_DIR, 'faster_rcnn_models',
                              NETS[args.demo_net][1])

    if not os.path.isfile(caffemodel):
        raise IOError(('{:s} not found.\nDid you run ./data/script/'
                       'fetch_faster_rcnn_models.sh?').format(caffemodel))

    if args.cpu_mode:
        caffe.set_mode_cpu()
    else:
        caffe.set_mode_gpu()
        caffe.set_device(args.gpu_id)
        cfg.GPU_ID = args.gpu_id
    net = caffe.Net(prototxt, caffemodel, caffe.TEST)
    # Specify the caffe path; the path below is mine
    caffe_root = '/home/ouyang/GitRepository/py-faster-rcnn/caffe-fast-rcnn/'
    # import sys
    sys.path.insert(0, caffe_root + 'python')
    # import caffe

    # Figure size 10 x 10, nearest-neighbor interpolation, grayscale colormap
    plt.rcParams['figure.figsize'] = (10, 10)
    plt.rcParams['image.interpolation'] = 'nearest'
    plt.rcParams['image.cmap'] = 'gray'
    image_file = caffe_root + 'examples/images/vehicle_0000015.jpg'
    # Path to the ImageNet mean file
    npload = caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy'
    
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))
    transformer.set_mean('data', np.load(npload).mean(1).mean(1))
    # The reference model works on pixel values in 0~255 rather than 0~1
    transformer.set_raw_scale('data', 255)
    # The reference model expects BGR channel order, so swap the channels
    transformer.set_channel_swap('data', (2, 1, 0))
    im = caffe.io.load_image(image_file)
    net.blobs['data'].reshape(1, 3, 224, 224)
    net.blobs['data'].data[...] = transformer.preprocess('data', im)
    # Print each blob's name and shape: (batch size, number of feature maps, height, width)
    print [(k, v.data.shape) for k, v in net.blobs.items()]
    # Print the shapes of the network's weight parameters
    print [(k, v[0].data.shape) for k, v in net.params.items()]


   
    def show_image(im):
        if im.ndim == 3:
            im = im[:, :, ::-1]  # BGR -> RGB for display
        plt.imshow(im)
        plt.axis('off')  # hide the axes
        plt.show()

    # Tile an array of filters/feature maps onto one square grid image
    def vis_square(data, padsize=1, padval=0):
        # copy so the normalization below does not modify the net's data in place
        data = data.copy()
        data -= data.min()
        data /= data.max()

        # force the number of filters to be square
        n = int(np.ceil(np.sqrt(data.shape[0])))
        padding = ((0, n**2 - data.shape[0]), (0, padsize), (0, padsize)) + ((0, 0),) * (data.ndim - 3)
        data = np.pad(data, padding, mode='constant', constant_values=(padval, padval))

        # tile the filters into an n x n grid
        data = data.reshape((n, n) + data.shape[1:]).transpose((0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))
        data = data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:])

        # show_image(data)
        plt.imshow(data)
        # Save path for the figure; this is my path (save before show so the figure is not empty)
        plt.savefig("./tools/Vehicle_2000/fc6.jpg")
        plt.show()

    
    out = net.forward()
    image = net.blobs['data'].data[0].copy()  # batch size is 1 after the reshape above
    image -= image.min()
    image /= image.max()
    # show the original input image
    show_image(image.transpose(1, 2, 0))
    # the conv1 kernels learned by the network
    filters = net.params['conv1'][0].data
    vis_square(filters.transpose(0, 2, 3, 1))
    # the filtered output: 96 feature maps
    feat = net.blobs['conv1'].data[0, :96]
    vis_square(feat, padval=1)
    # the second conv layer: the first 96 filters, each filter shown as one row
    filters = net.params['conv2'][0].data
    vis_square(filters[:96].reshape(96**2, 5, 5))
    # the second layer output: 256 feature maps
    feat = net.blobs['conv2'].data[0]
    vis_square(feat, padval=1)


    # the third conv layer filters
    filters = net.params['conv3'][0].data
    vis_square(filters[:256].reshape(256**2, 3, 3))

    # the third conv layer output: all 384 feature maps
    feat = net.blobs['conv3'].data[0]
    vis_square(feat, padval=0.5)


    # the fourth conv layer filters, each filter shown as one row
    filters = net.params['conv4'][0].data
    vis_square(filters[:384].reshape(384**2, 3, 3))

    # the fourth conv layer output: all 384 feature maps
    feat = net.blobs['conv4'].data[0]
    vis_square(feat, padval=0.5)

    # the fifth conv layer filters; in the ZF net the conv5 weights have shape
    # (256, 384, 3, 3), so flatten all filter channels instead of assuming 384**2
    filters = net.params['conv5'][0].data
    vis_square(filters.reshape(256 * 384, 3, 3))

    # the fifth conv layer output: all 256 feature maps
    feat = net.blobs['conv5'].data[0]
    vis_square(feat, padval=0.5)

    # the fifth pooling layer; in the Faster R-CNN ZF test prototxt this blob
    # may be named 'pool5' or 'roi_pool_conv5', adjust to your network definition
    feat = net.blobs['pool5'].data[0]
    vis_square(feat, padval=1)
    # histogram of the sixth layer (fc6) output
    feat = net.blobs['fc6'].data[0]
    plt.subplot(2, 1, 1)
    plt.plot(feat.flat)
    plt.subplot(2, 1, 2)
    _ = plt.hist(feat.flat[feat.flat > 0], bins=100)
    # plt.axis('off')  # hide the axes
    plt.savefig("fc6_zhifangtu.jpg")  # save before show so the figure is not empty
    plt.show()
    # histogram of the seventh layer (fc7) output
    feat = net.blobs['fc7'].data[0]
    plt.subplot(2, 1, 1)
    plt.plot(feat.flat)
    plt.subplot(2, 1, 2)
    _ = plt.hist(feat.flat[feat.flat > 0], bins=100)
    plt.savefig("fc7_zhifangtu.jpg")
    plt.show()
    # look at the labels
    # run the test
    image_labels_filename = caffe_root + 'data/ilsvrc12/synset_words.txt'
    # try:
    labels = np.loadtxt(image_labels_filename, str, delimiter='\t')
    top_k = net.blobs['prob'].data[0].flatten().argsort()[-1:-6:-1]
    # print labels[top_k]
    for i in np.arange(top_k.size):
        print top_k[i], labels[top_k[i]]
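With the paths above adapted to your own directories, the modified script is run like the stock demo, for example python tools/demo.py --net zf; the --cpu flag defined in parse_args can be added on machines without a GPU.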

Here are some test results

Figure 3 Original detection image

Figure 4 Conv1 parameter visualization

Figure 5 Conv1 feature visualization

deep-visualization-toolbox

deep-visualization-toolbox is the source code for Jason Yosinski's paper on visualizing convolutional neural networks, published in Computer Science [2]. There is a Bilibili video on how to use the tool: www.bilibili.com/video/av740… . The tool itself is available on GitHub: github.com/yosinski/de… , where complete installation and configuration steps are given. Taking a horse image as the example again, several test results are pasted below.

Figure 6. ToolBox Conv1 feature visualization

Figure 7. ToolBox Conv2 feature visualization

Judging from the results, the tool is quite intuitive to use. In the column of thumbnails on the left of the window, the top-left image is the input; the large central area shows the feature maps obtained as the image propagates forward through the network; and the lower-left corner shows the feature visualization of the selected unit.

Loss visualization

Visualizing the loss value during network training helps in analyzing whether the parameters of the network model are appropriate. When training a Faster R-CNN model, the loss values for each stage of training are saved to a log file once training completes, as shown in Figure 8. Visualizing the loss is then just a matter of writing a simple Python program that reads the iteration numbers and the corresponding loss values from the log file and plots them.

Figure 8. Training log for the model
#!/usr/bin/env python
import os
import sys
import numpy as np
import matplotlib.pyplot as plt
import math
import re
import pylab
from pylab import figure, show, legend
from mpl_toolkits.axes_grid1 import host_subplot

# log file name
fp = open('faster_rcnn_end2end_ZF_.txt.2018-04-13_19-46-23', 'r', encoding='UTF-8')

train_iterations = []
train_loss = []
test_iterations = []
# test_accuracy = []

for ln in fp:
    # get train_iterations and train_loss
    if '] Iteration ' in ln and 'loss = ' in ln:
        arr = re.findall(r'ion \b\d+\b,', ln)
        train_iterations.append(int(arr[0].strip(',')[4:]))
        train_loss.append(float(ln.strip().split(' = ')[-1]))
fp.close()

host = host_subplot(111)
plt.subplots_adjust(right=0.8)  # adjust the right boundary of the plot window
# par1 = host.twinx()

# set labels
host.set_xlabel("iterations")
host.set_ylabel("RPN loss")
# par1.set_ylabel("validation accuracy")

# plot curves
p1, = host.plot(train_iterations, train_loss, label="train RPN loss")

host.legend(loc=1)

# set label color
host.axis["left"].label.set_color(p1.get_color())
host.set_xlim([-1000, 60000])
host.set_ylim([0., 3.5])

plt.draw()
plt.show()

The visualization is shown below

Figure 9 Visualization of Loss
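A note on the parsing step in the script above: it relies on Caffe's usual solver log lines of the form "…] Iteration N, loss = X". The minimal sketch below checks the two extraction expressions against a made-up sample line (the timestamp and process prefix of a real log will differ):

import re

# Made-up sample in the style of a Caffe solver log line; for illustration only
ln = 'I0413 19:46:23.123456  1234 solver.cpp:228] Iteration 100, loss = 1.2345'

if '] Iteration ' in ln and 'loss = ' in ln:
    arr = re.findall(r'ion \b\d+\b,', ln)      # matches 'ion 100,'
    iteration = int(arr[0].strip(',')[4:])     # -> 100
    loss = float(ln.strip().split(' = ')[-1])  # -> 1.2345
    assert iteration == 100 and abs(loss - 1.2345) < 1e-9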

Drawing the PR curve

In the folder where Faster R-CNN writes its trained network models, there is also a file recording the precision and recall of every class of detection target on each image. From these values the precision-recall (PR) curve can be drawn; the area under the PR curve is the average precision (AP). The data is stored as .pkl files under output\faster_rcnn_end2end\voc_2007_test\zf_faster_rcnn_iter (my path), and needs to be converted to .txt files first. The code is as follows:

#-*-coding:utf-8-*-
import cPickle as pickle
import numpy as np

np.set_printoptions(threshold=np.NaN)

fr = open('./aeroplane_pr.pkl')
inf = pickle.load(fr)
print inf

fo = open("aeroplane_pr.txt", "wb")
fo.write(str(inf))
fo.close()
fr.close()  # close the files

After running this program, the .pkl file is converted to a .txt file, in which the precision and recall for each image can be read directly. The PR curve can then be drawn in the same way as the loss plot; the result is shown in Figure 10.
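As a sketch of that final plotting step: assuming the .pkl was produced by py-faster-rcnn's voc_eval, it holds a dict with 'rec', 'prec' and 'ap' entries (verify against the .txt dump above before relying on this):

#-*-coding:utf-8-*-
import cPickle as pickle
import matplotlib.pyplot as plt

# Assumption: the pickle holds {'rec': ..., 'prec': ..., 'ap': ...}, as written
# by py-faster-rcnn's voc_eval; check your own .txt dump if the keys differ.
fr = open('./aeroplane_pr.pkl', 'rb')
inf = pickle.load(fr)
fr.close()

plt.plot(inf['rec'], inf['prec'], lw=2)
plt.xlabel('recall')
plt.ylabel('precision')
plt.title('aeroplane PR curve (AP = {:.4f})'.format(inf['ap']))
plt.grid(True)
plt.savefig('aeroplane_pr.png')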

Figure 10 PR curve

References

[1] Xue Kaiyu, Caffe Learning Notes

[2] Yosinski J, Clune J, Nguyen A, et al. Understanding Neural Networks Through Deep Visualization[J]. Computer Science, 2015.