This is a translation of the article "Prototyping Kernels and Advanced Visualization with Python Ops". It is intended for academic exchange only, not for commercial use. If it infringes any rights, please get in touch and it will be removed; if you spot any errors, please point them out.

Kernel operations in TensorFlow are written entirely in C++ for efficiency. But writing a TensorFlow kernel in C++ can be quite a pain, so before spending hours implementing your own kernel you may want to prototype the operation quickly, even if the prototype is inefficient. With tf.py_func() you can turn any piece of Python code into a TensorFlow operation.

For example, here is how a simple ReLU nonlinearity can be implemented in Python and turned into a TensorFlow operation via tf.py_func():

import numpy as np
import tensorflow as tf
import uuid

def relu(inputs):
    # Define the op in python
    def _relu(x):
        return np.maximum(x, 0.)

    # Define the op's gradient in python
    def _relu_grad(x):
        return np.float32(x > 0)

    # An adapter that defines a gradient op compatible with TensorFlow
    def _relu_grad_op(op, grad):
        x = op.inputs[0]
        x_grad = grad * tf.py_func(_relu_grad, [x], tf.float32)
        return x_grad

    # Register the gradient with a unique id
    grad_name = "MyReluGrad_" + str(uuid.uuid4())
    tf.RegisterGradient(grad_name)(_relu_grad_op)

    # Override the gradient of the custom op
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": grad_name}):
        output = tf.py_func(_relu, [inputs], tf.float32)
    return output

Using TensorFlow's gradient checker, you can verify that these gradients are computed correctly:

x = tf.random_normal([10])
y = relu(x * x)

with tf.Session():
    diff = tf.test.compute_gradient_error(x, [10], y, [10])
    print(diff)
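To make concrete what a gradient check of this kind does, here is a NumPy-only sketch of the idea: estimate the derivative with finite differences and compare it with the hand-written one. It is a simplified stand-in, not TensorFlow's implementation; the helper names (relu, relu_grad, numeric_grad) and the step size eps are assumptions made for illustration, and the elementwise shortcut only works because the function is applied elementwise.

```python
import numpy as np

def relu(x):
    # Same forward function as the Python op in the article.
    return np.maximum(x, 0.0)

def relu_grad(x):
    # Analytic derivative of ReLU: 1 where x > 0, else 0.
    return (x > 0).astype(x.dtype)

def numeric_grad(f, x, eps=1e-6):
    # Central finite difference, applied elementwise.
    # (Hypothetical helper; valid here because f acts elementwise.)
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

rng = np.random.RandomState(0)
x = rng.randn(10)

# Mirror the article's example y = relu(x * x).
# By the chain rule, dy/dx = relu_grad(x * x) * 2 * x.
f = lambda v: relu(v * v)
analytic = relu_grad(x * x) * 2.0 * x
numeric = numeric_grad(f, x)

# The largest absolute difference plays the role of the reported error.
max_error = np.max(np.abs(analytic - numeric))
print(max_error)
```

If the hand-written gradient were wrong (for example, missing the factor 2 * x), max_error would be on the order of the gradient itself rather than close to zero.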

compute_gradient_error() computes the gradient numerically and returns its difference from the analytic gradient you provided. What we want is a very small difference. Note that this implementation is quite inefficient and is only useful for prototyping, because Python code is not parallelized and does not run on the GPU. Once you have verified your idea, you will want to rewrite it as a C++ kernel.

In practice, we commonly use Python ops for visualization in TensorBoard. Suppose you are building an image classification model and want to visualize your model's predictions during training. TensorFlow lets you visualize images with the tf.summary.image() function:

image = tf.placeholder(tf.float32)
tf.summary.image("image", image)

But this only visualizes the input image. To visualize the predictions, you have to find a way to draw annotations (such as the predicted label) onto the image, which is nearly impossible with the existing TensorFlow operations. An easier approach is to do the drawing in Python and wrap it in a Python op:

import io
import matplotlib.pyplot as plt
import numpy as np
import PIL.Image
import tensorflow as tf

def visualize_labeled_images(images, labels, max_outputs=3, name="image"):
    def _visualize_image(image, label):
        # Do the actual drawing in python
        fig = plt.figure(figsize=(3, 3), dpi=80)
        ax = fig.add_subplot(111)
        # Flip the image vertically before showing it
        ax.imshow(image[::-1, ...])
        ax.text(0, 0, str(label),
          horizontalalignment="left",
          verticalalignment="top")
        fig.canvas.draw()

        # Write the plot as a memory file.
        buf = io.BytesIO()
        fig.savefig(buf, format="png")
        buf.seek(0)
        # Release the figure so repeated calls do not accumulate open figures
        plt.close(fig)

        # Read the image and convert to numpy array
        img = PIL.Image.open(buf)
        # PIL reports size as (width, height); the array is (height, width, channels)
        return np.array(img.getdata()).reshape(img.size[1], img.size[0], -1)

    def _visualize_images(images, labels):
        # Only display the given number of examples in the batch
        outputs = []
        for i in range(max_outputs):
            output = _visualize_image(images[i], labels[i])
            outputs.append(output)
        return np.array(outputs, dtype=np.uint8)

    # Run the python op.
    figs = tf.py_func(_visualize_images, [images, labels], tf.uint8)
    return tf.summary.image(name, figs)

Note that because summaries are usually evaluated only once in a while (not at every step), this implementation can be used in practice without worrying about efficiency.
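The pattern behind that remark can be sketched in plain Python: the cheap training step runs on every iteration, while the expensive summary runs only every so often. Everything here is hypothetical and for illustration only (run_training, summary_every, and the two stand-in callbacks); in a real program the callbacks would run the train op and the summary op.

```python
def run_training(num_steps, summary_every, train_step, write_summary):
    """Call train_step on every step and write_summary every `summary_every` steps."""
    for step in range(num_steps):
        train_step(step)
        if step % summary_every == 0:
            # The slow Python drawing code would only run here.
            write_summary(step)

# Stand-in callbacks that just count how often they are called.
calls = {"train": 0, "summary": 0}

def fake_train_step(step):
    calls["train"] += 1

def fake_write_summary(step):
    calls["summary"] += 1

run_training(num_steps=1000, summary_every=100,
             train_step=fake_train_step, write_summary=fake_write_summary)
print(calls)
```

With these settings the drawing code runs on 1% of the steps, so its cost is spread thinly over the whole run.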