Preface

In 2018, when I was just getting started, I wrote an article on this topic (see the previous post). Many friends still read it, but I no longer think it is comprehensive enough. I recently found a better way to do it, and I am sharing it with you today.

I am a blogger who focuses on hands-on practice, so this article will not cover the underlying principles.

The interpretability of neural networks has long been a hot topic, especially for classification. If you do not visualize what your network has learned, reviewers may well reject your paper. If you do provide a visualization, your paper's chances of acceptance improve. Something like this.

Key point: today I introduce a method that needs no hand-written algorithm code, because a package does the work. Simple and efficient.

First, please copy this address 👇👇👇 github.com/jacobgil/py…

Let’s talk about how to use it.

1. What can the pytorch-grad-cam library do?

The library provides a variety of activation mapping methods, as follows:

Method          What it does
GradCAM         Weights the 2D activations by the average gradient
GradCAM++       Like GradCAM, but uses second-order gradients
XGradCAM        Like GradCAM, but scales the gradients by the normalized activations
AblationCAM     Zeroes out activations and measures how the output drops (this repository includes a fast batched implementation)
ScoreCAM        Perturbs the image by the scaled activations and measures how the output drops
EigenCAM        Takes the first principal component of the 2D activations (no class discrimination, but seems to give great results)
EigenGradCAM    Like EigenCAM but with class discrimination: the first principal component of activations*grad. Looks like GradCAM, but cleaner
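I promised not to go into principles, but the first row of the table is easy to show in a few lines. The following is only a minimal NumPy sketch of the idea "weight the 2D activations by the average gradient". It is not the library's implementation: the helper name `grad_cam_map` and the arrays `activations` and `gradients` are made up for illustration, and the inputs are random numbers rather than values from a real network.

```python
import numpy as np


def grad_cam_map(activations, gradients):
    """Illustrative Grad-CAM weighting for one image.

    activations, gradients: arrays of shape (channels, height, width),
    i.e. the target layer's output and the gradient of the class score
    with respect to that output.
    """
    # One weight per channel: the spatial average of its gradient.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the activation maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)
    # Keep only positive evidence for the class.
    cam = np.maximum(cam, 0)
    # Scale to [0, 1] so it can be shown as a heatmap.
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam


# Random stand-ins for a real layer's activations and gradients.
rng = np.random.default_rng(0)
activations = rng.random((8, 7, 7)).astype(np.float32)
gradients = rng.standard_normal((8, 7, 7)).astype(np.float32)
cam = grad_cam_map(activations, gradients)
print(cam.shape)
```

The other methods in the table mostly differ in how the per-channel weights are obtained, which is why they can share the same calling convention in the package.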

2. Install pytorch-grad-cam

pip install grad-cam

3. Specific use cases

3.1 Selecting the Target Layer

You need to select the target layer for which you want to compute the CAM. Some common options are:

  • Resnet18 and 50: model.layer4[-1]
  • VGG and densenet161: model.features[-1]
  • mnasnet1_0: model.layers[-1]
  • ViT: model.blocks[-1].norm1

The target layer is usually the last convolutional layer. If you do not know what the last convolutional layer of your model is called, see my earlier notes on that.
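If you would rather find the layer programmatically, here is a rough sketch that walks over a model's modules and reports the last `nn.Conv2d` it sees. The helper `last_conv_layer` and the tiny `toy_model` are my own illustration (the toy network only exists so the example runs without downloading weights). Keep in mind that `named_modules()` follows registration order, which usually, but not necessarily, matches the order of the forward pass.

```python
import torch.nn as nn


def last_conv_layer(model):
    """Return (name, module) of the last nn.Conv2d found in the model."""
    last = None
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            last = (name, module)
    if last is None:
        raise ValueError("model contains no nn.Conv2d layer")
    return last


# A tiny stand-in network so the example runs without downloading weights.
toy_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)

name, layer = last_conv_layer(toy_model)
print(name, layer)
```

The name only tells you where to look. The common choices listed above pick the whole block that contains the last convolution (for example model.layer4[-1]) rather than the bare convolution itself.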

3.2 CAM for a single image

As an example, we compute the CAM for the dog category in the figure above. Download the original image (both.png) from the picture address and save it in the examples folder of the current project. The procedure breaks down into roughly 7 steps.

1. Import the relevant packages and load the model

from pytorch_grad_cam import GradCAM, ScoreCAM, GradCAMPlusPlus, AblationCAM, XGradCAM, EigenCAM
from pytorch_grad_cam.utils.image import show_cam_on_image, \
                                         deprocess_image, \
                                         preprocess_image
from torchvision.models import resnet50
import cv2
import numpy as np
import os

os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# 1. Load the model
model = resnet50(pretrained=True) 

We set pretrained=True because we use the already trained model directly to predict on our image. Downloading the pretrained weights takes a while the first time. If the download is too slow, a VPN may speed it up.

I added os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE" because on macOS, without this line, I get the following error: OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized.

2. Select the target layer

# 2. Select the target layer
target_layer = model.layer4[-1]

Section 3.1 lists the target layers for common models.

3. Convert the input image to a tensor so that it can be fed to the model

image_path = './examples/both.png'
rgb_img = cv2.imread(image_path, 1)[:, :, ::-1]   # flag 1 = cv2.IMREAD_COLOR (BGR); [:, :, ::-1] converts to RGB
rgb_img = np.float32(rgb_img) / 255

# preprocess_image: normalizes the image and converts it to a tensor
input_tensor = preprocess_image(rgb_img, mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])   # torch.Size([1, 3, 224, 224])
# Create an input tensor image for your model..
# Note: input_tensor can be a batch tensor with several images!

This step covers the image path, reading the image, normalizing it, and converting it to a tensor. The image processing here is very simple, but if your model expects specific preprocessing (image size, channel order, and so on), follow your own settings here. input_tensor can also be a batch of images.
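To make the normalization step less of a black box, here is a small NumPy sketch of roughly what this preprocessing amounts to: per-channel standardization, then a change from height x width x channel to a batch of channel x height x width. The function `normalize_to_batch` is a name I made up, and it returns a NumPy array, whereas the real preprocess_image helper returns a torch tensor.

```python
import numpy as np


def normalize_to_batch(rgb_img, mean, std):
    """Normalize an HxWx3 float image in [0, 1] and return a 1x3xHxW array."""
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    # Per-channel standardization; broadcasting applies over the last axis.
    normalized = (rgb_img.astype(np.float32) - mean) / std
    # HWC -> CHW, then add a leading batch dimension.
    return normalized.transpose(2, 0, 1)[np.newaxis, ...]


rgb_img = np.full((224, 224, 3), 0.5, dtype=np.float32)  # dummy gray image
batch = normalize_to_batch(rgb_img,
                           mean=[0.485, 0.456, 0.406],
                           std=[0.229, 0.224, 0.225])
print(batch.shape)  # (1, 3, 224, 224)
```

If your own model was trained with a different mean, standard deviation, or input size, those are the values to change.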

4. Initialize the CAM object with the model, the target layer, whether to use CUDA, and so on

# Construct the CAM object once, and then re-use it on many images:
# 4. Initialize GradCAM with the model, the target layer and whether CUDA is used
cam = GradCAM(model=model, target_layer=target_layer, use_cuda=False)

Select the CAM method you want here. We chose GradCAM. Once the CAM object has been created, it can be reused to process many images.

5. Select the target category. If this parameter is not set, the category with the highest score is selected by default

# If target_category is None, the highest scoring category
# will be used for every image in the batch.
# target_category can also be an integer, or a list of different integers
# for every image in the batch.
# 5. Select the target category; if it is None, the highest scoring category is used
target_category = None # 281

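We need to specify not only which layer of the model to use, but also which category the CAM is computed for. Setting it to None uses the category with the highest score, which is usually what you want. You can also specify a category explicitly, for example target_category = 281. I first assumed 281 was the dog category without checking 😂. In the standard ImageNet-1k label list, index 281 is actually "tabby cat", so look up the index you need before hard-coding it.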
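Leaving the category as None amounts to taking the argmax of the model's output scores. If you want to see which indices the model actually favors before hard-coding one, a sketch like the following may help. The helper `top_k_categories` and the five made-up scores are purely illustrative. In real use the scores would presumably come from running the model on input_tensor.

```python
import numpy as np


def top_k_categories(logits, k=3):
    """Return [(category_index, probability), ...] for the k best scores."""
    logits = np.asarray(logits, dtype=np.float64)
    # Numerically stable softmax.
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    order = np.argsort(probs)[::-1][:k]
    return [(int(i), float(probs[i])) for i in order]


# Made-up scores for a 5-class model, not real ImageNet output.
fake_logits = [0.1, 2.0, -1.0, 3.5, 0.3]
best = top_k_categories(fake_logits, k=2)
print(best[0][0])  # index of the highest scoring category
```

Comparing the top few indices against the label list is a quick way to confirm that the number you pass as target_category means what you think it means.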

6. Compute the CAM

# You can also pass aug_smooth=True and eigen_smooth=True, to apply smoothing.
# 6. Compute CAM
grayscale_cam = cam(input_tensor=input_tensor, target_category=target_category)  # [batch, 224, 224]

With the preparation done, we can compute the CAM. It takes a single line. There are a few more parameters, which you can explore on your own if you are interested. For example, to reduce noise in the CAM and make it fit the object better, two smoothing methods are supported:

  • aug_smooth=True

    Test-time augmentation: increases the run time by a factor of 6.

    Applies combinations of horizontal flips and multiplying the image by [1.0, 1.1, 0.9].

    This centers the CAM better around the object.
  • eigen_smooth=True

    Takes the first principal component of activations*weights.

    This removes a lot of noise.

These two methods can be used separately or together.
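For the curious, the two smoothing ideas can be sketched in NumPy as follows. This is my own illustration under stated assumptions, not the package's code: `compute_cam` is a hypothetical callable that maps an image to a CAM, `toy_cam` is a trivial stand-in for it, and the real implementation may differ in details such as rescaling. The factor of 6 in the run time comes from 2 flip states times 3 intensity factors.

```python
import numpy as np


def aug_smooth_cam(image, compute_cam):
    """Average CAMs over 2 flips x 3 intensity factors (6 forward passes).

    image: array of shape (channels, height, width).
    compute_cam: hypothetical callable mapping such an image to a
    (height, width) CAM.
    """
    cams = []
    for flip in (False, True):
        for factor in (1.0, 1.1, 0.9):
            augmented = image * factor
            if flip:
                augmented = augmented[:, :, ::-1]
            cam = compute_cam(augmented)
            if flip:
                # Undo the flip so all CAMs line up with the original image.
                cam = cam[:, ::-1]
            cams.append(cam)
    return np.mean(cams, axis=0)


def eigen_smooth_cam(weighted_activations):
    """Project (channels, height, width) maps onto their first principal component."""
    channels, height, width = weighted_activations.shape
    # Rows are spatial positions, columns are channels.
    flat = weighted_activations.reshape(channels, -1).T
    flat = flat - flat.mean(axis=0)
    # The first right singular vector is the first principal direction.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    projection = flat @ vt[0]
    return projection.reshape(height, width)


# Toy stand-in for a real CAM method: the channel-wise mean of the image.
def toy_cam(img):
    return img.mean(axis=0)


rng = np.random.default_rng(0)
image = rng.random((3, 8, 8))
smoothed = aug_smooth_cam(image, toy_cam)
projected = eigen_smooth_cam(rng.random((4, 8, 8)))
print(smoothed.shape, projected.shape)
```

Note that the sign of a principal component is arbitrary, so a projection like this generally needs some post-processing before it can be displayed as a heatmap.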

An example from the GitHub repository is shown below, comparing the plain CAM, aug smooth, eigen smooth, and aug + eigen smooth.

7. Display and save the heatmap

# In this example grayscale_cam has only one image in the batch:
# 7. Display and save the heatmap; grayscale_cam holds a batch of results, so pick one to display
grayscale_cam = grayscale_cam[0]
visualization = show_cam_on_image(rgb_img, grayscale_cam)  # (224, 224, 3)
cv2.imwrite('cam_dog.jpg', visualization)

Congratulations! With luck, you get the following result.

If you use the same model and the same picture as I do but your result looks clearly worse, one of the steps has probably gone wrong. Do not settle for a picture that is merely similar. Go back and check each step.
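When a result looks wrong, two things are worth checking first: the image passed to the overlay step should be float values in [0, 1], and the CAM should have the same height and width as the image. The sketch below shows a simplified overlay with those two checks built in. `overlay_heatmap` is a made-up helper that uses a deliberately crude blue-to-red colormap. It is not how show_cam_on_image is implemented.

```python
import numpy as np


def overlay_heatmap(rgb_img, mask):
    """Blend a [0, 1] CAM mask onto a [0, 1] float RGB image.

    Uses a deliberately simple blue-to-red colormap instead of a real
    library colormap, to stay dependency-free.
    """
    if rgb_img.shape[:2] != mask.shape:
        raise ValueError("image and mask must have the same height and width")
    if rgb_img.max() > 1.0 or rgb_img.min() < 0.0:
        raise ValueError("image must be float values in [0, 1]")
    heatmap = np.zeros_like(rgb_img, dtype=np.float32)
    heatmap[..., 0] = mask          # red grows with the CAM value
    heatmap[..., 2] = 1.0 - mask    # blue fades as the CAM value grows
    blended = heatmap + rgb_img.astype(np.float32)
    blended = blended / blended.max()
    return np.uint8(255 * blended)


rgb_img = np.full((4, 4, 3), 0.5, dtype=np.float32)            # dummy gray image
mask = np.linspace(0, 1, 16, dtype=np.float32).reshape(4, 4)   # dummy CAM
result = overlay_heatmap(rgb_img, mask)
print(result.shape, result.dtype)
```

A typical mistake that this catches is forgetting the division by 255 when loading the image, which leaves pixel values in the range 0 to 255.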

Let’s summarize this part of the code and put it together for everyone to copy.

# Visualize individual images
from pytorch_grad_cam import GradCAM, ScoreCAM, GradCAMPlusPlus, AblationCAM, XGradCAM, EigenCAM
from pytorch_grad_cam.utils.image import show_cam_on_image, \
                                         deprocess_image, \
                                         preprocess_image
from torchvision.models import resnet50
import cv2
import numpy as np
import os

os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# 1. Load the model
model = resnet50(pretrained=True)
# 2. Select the target layer
target_layer = model.layer4[-1]
# 3. Convert the input image to a tensor
image_path = './examples/both.png'
rgb_img = cv2.imread(image_path, 1)[:, :, ::-1]   # flag 1 = cv2.IMREAD_COLOR (BGR); [:, :, ::-1] converts to RGB
rgb_img = np.float32(rgb_img) / 255

# preprocess_image: normalizes the image and converts it to a tensor
input_tensor = preprocess_image(rgb_img, mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])   # torch.Size([1, 3, 224, 224])
# Create an input tensor image for your model..
# Note: input_tensor can be a batch tensor with several images!

# Construct the CAM object once, and then re-use it on many images:
# 4. Initialize GradCAM with the model, the target layer and whether CUDA is used
cam = GradCAM(model=model, target_layer=target_layer, use_cuda=False)

# If target_category is None, the highest scoring category
# will be used for every image in the batch.
# target_category can also be an integer, or a list of different integers
# for every image in the batch.
# 5. Select the target category, if not set, default to the category with the highest score
target_category = None # 281

# You can also pass aug_smooth=True and eigen_smooth=True, to apply smoothing.
# 6. Compute CAM
grayscale_cam = cam(input_tensor=input_tensor, target_category=target_category)  # [batch, 224, 224]

# In this example grayscale_cam has only one image in the batch:
# 7. Display and save the heatmap; grayscale_cam holds a batch of results, so pick one to display
grayscale_cam = grayscale_cam[0]
visualization = show_cam_on_image(rgb_img, grayscale_cam)  # (224, 224, 3)
cv2.imwrite('cam_dog.jpg', visualization)

3.3 Batch image processing

The steps above handle a single image, but we often need to process many. That is nothing more than adding a loop.

Here is just one idea: put all image paths in a list, then loop over the list and, for each image, load it, preprocess it, compute the CAM, and save the result.
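The loop itself could look something like the sketch below. It only handles the bookkeeping (collecting paths, naming outputs). The actual work is delegated to `process_one`, a hypothetical callable of my own invention that would wrap the load, preprocess, compute and save steps from section 3.2. In the demo it is replaced by a stand-in that just copies bytes, so the example runs without any model.

```python
import tempfile
from pathlib import Path


def process_folder(input_dir, output_dir, process_one, pattern="*.png"):
    """Call process_one(src, dst) for every matching image in input_dir.

    process_one is a hypothetical callable that loads the image at `src`,
    computes its CAM and writes the visualization to `dst`.
    Returns the list of output paths that were written.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(input_dir).glob(pattern)):
        dst = output_dir / f"{src.stem}_cam.jpg"
        process_one(src, dst)
        written.append(dst)
    return written


# Demo with placeholder files and a stand-in for the real CAM step.
with tempfile.TemporaryDirectory() as tmp:
    src_dir = Path(tmp) / "examples"
    src_dir.mkdir()
    for file_name in ("a.png", "b.png"):
        (src_dir / file_name).write_bytes(b"placeholder")
    outputs = process_folder(
        src_dir,
        Path(tmp) / "out",
        lambda src, dst: dst.write_bytes(src.read_bytes()),
    )
    print([p.name for p in outputs])
```

In a real run, the model and the CAM object would be created once outside the loop and only the per-image steps would go inside `process_one`, which is the reuse mentioned in step 4.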

3.4 A CAM computing template

Editing the code by hand every time we want to evaluate a different image would be impractical, so we wrap the code in a script and only change the parameters on each run. The full code below is copied from the original author. Real kudos to this dedicated author, who keeps updating the code. You can find more on GitHub.

# copy from https://github.com/jacobgil/pytorch-grad-cam/blob/master/cam.py

import argparse
import cv2
import numpy as np
import torch
from torchvision import models

from pytorch_grad_cam import GradCAM, \
                             ScoreCAM, \
                             GradCAMPlusPlus, \
                             AblationCAM, \
                             XGradCAM, \
                             EigenCAM, \
                             EigenGradCAM

from pytorch_grad_cam import GuidedBackpropReLUModel
from pytorch_grad_cam.utils.image import show_cam_on_image, \
                                         deprocess_image, \
                                         preprocess_image


# Avoids "OMP: Error #15: Initializing libiomp5.dylib, but found libomp.dylib already initialized" on macOS
import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

def get_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--use-cuda', action='store_true', default=False,
                        help='Use NVIDIA GPU acceleration')
    parser.add_argument('--image-path', type=str, default='./examples/both.png',
                        help='Input image path')
    parser.add_argument('--aug_smooth', action='store_true',
                        help='Apply test time augmentation to smooth the CAM')
    parser.add_argument('--eigen_smooth', action='store_true',
                        help='Reduce noise by taking the first principal component '
                             'of cam_weights*activations')
    parser.add_argument('--method', type=str, default='gradcam',
                        choices=['gradcam', 'gradcam++', 'scorecam', 'xgradcam',
                                 'ablationcam', 'eigencam', 'eigengradcam'],
                        help='Can be gradcam/gradcam++/scorecam/xgradcam'
                             '/ablationcam/eigencam/eigengradcam')

    args = parser.parse_args()
    args.use_cuda = args.use_cuda and torch.cuda.is_available()
    if args.use_cuda:
        print('Using GPU for acceleration')
    else:
        print('Using CPU for computation')

    return args


if __name__ == '__main__':
    """ python cam.py --image-path <path_to_image>

    Example usage of loading an image, and computing:
        1. CAM
        2. Guided Back Propagation
        3. Combining both
    """

    args = get_args()
    methods = \
        {"gradcam": GradCAM,
         "scorecam": ScoreCAM,
         "gradcam++": GradCAMPlusPlus,
         "ablationcam": AblationCAM,
         "xgradcam": XGradCAM,
         "eigencam": EigenCAM,
         "eigengradcam": EigenGradCAM}

    model = models.resnet50(pretrained=True)

    # Choose the target layer you want to compute the visualization for.
    # Usually this will be the last convolutional layer in the model.
    # Some common choices can be:
    # Resnet18 and 50: model.layer4[-1]
    # VGG, densenet161: model.features[-1]
    # mnasnet1_0: model.layers[-1]
    # You can print the model to help choose the layer
    target_layer = model.layer4[-1]

    cam = methods[args.method](model=model,
                               target_layer=target_layer,
                               use_cuda=args.use_cuda)

    rgb_img = cv2.imread(args.image_path, 1)[:, :, ::-1]
    rgb_img = np.float32(rgb_img) / 255
    input_tensor = preprocess_image(rgb_img, mean=[0.485, 0.456, 0.406],
                                             std=[0.229, 0.224, 0.225])

    # If None, returns the map for the highest scoring category.
    # Otherwise, targets the requested category.
    target_category = None

    # AblationCAM and ScoreCAM have batched implementations.
    # You can override the internal batch size for faster computation.
    cam.batch_size = 32

    grayscale_cam = cam(input_tensor=input_tensor,
                        target_category=target_category,
                        aug_smooth=args.aug_smooth,
                        eigen_smooth=args.eigen_smooth)

    # Here grayscale_cam has only one image in the batch
    grayscale_cam = grayscale_cam[0, :]

    cam_image = show_cam_on_image(rgb_img, grayscale_cam)

    gb_model = GuidedBackpropReLUModel(model=model, use_cuda=args.use_cuda)
    gb = gb_model(input_tensor, target_category=target_category)

    cam_mask = cv2.merge([grayscale_cam, grayscale_cam, grayscale_cam])
    cam_gb = deprocess_image(cam_mask * gb)
    gb = deprocess_image(gb)

    cv2.imwrite(f'{args.method}_cam.jpg', cam_image)
    cv2.imwrite(f'{args.method}_gb.jpg', gb)
    cv2.imwrite(f'{args.method}_cam_gb.jpg', cam_gb)

You can use the following call from the terminal

python cam.py --image-path <path_to_image> --method <method>

For example:

python cam.py --image-path './examples/both.png' --method 'gradcam'

A tip: I like to open the terminal directly in PyCharm, so that I do not have to activate the environment or change directories.

I wrote this article in fits and starts and only finished it after a week's break. I admit to being a little vain: I am happy when someone praises me, happy when I gain a new follower, and happy when someone acknowledges my contribution, and all of that makes me more eager to share. So if you have read this far and found the article useful, please follow, like, and leave me a comment.