This article demonstrates how to use the Inception V3 model for image recognition. Other tutorials in this series: 01 – a simple linear model | 02 – convolutional neural network | 03 – PrettyTensor | 04 – save & restore | 05-06 – ensemble learning on CIFAR-10

By Magnus Erik Hvass Pedersen / GitHub / Videos on YouTube. Translated from the English original.

If reproduced, please attach a link to this article.


Introduction

This tutorial demonstrates how to classify images using a pre-trained deep neural network called Inception V3.

The Inception V3 model was trained for several weeks on a powerful computer with 8 Tesla K40 GPUs, costing approximately $30,000, so it is not feasible to train it on an ordinary PC. Instead, we will download the pre-trained Inception model and use it for image classification.

The Inception V3 model has approximately 25 million parameters and uses about 5 billion multiply-add operations to classify a single image. Even so, on a modern PC without a GPU, an image can be classified in the blink of an eye.

This tutorial hides the TensorFlow code, so it does not require much experience with TensorFlow, although a basic understanding of TensorFlow from the previous tutorials is helpful, especially if you want to study the implementation details in the inception.py file.

The flow chart

The following flow chart shows the data flow in the Inception V3 model, a convolutional neural network with many layers and a complex structure. The paper has more details on how the Inception model is constructed and why it is designed that way, but the authors admit that they do not fully understand how the model works.

Note that the Inception model has two softmax outputs. One is used during training of the neural network; the other is used for classifying images after training has finished, the so-called inference phase.

A new model was released just recently; it is more complex than Inception V3 and achieves better classification accuracy.

from IPython.display import Image, display
Image('images/07_inception_flowchart.png')

Imports

%matplotlib inline
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import os

# Functions and classes for loading and using the Inception model.
import inception

Developed using Python 3.5.2 (Anaconda). The TensorFlow version is:

tf.__version__

'0.10.0rc0'

Download Inception model

Download the Inception model from the web. This is the default folder for saving the data files; the folder is created automatically if it does not exist.

# inception.data_dir = 'inception/'

If the Inception model does not exist in the folder, it is downloaded automatically. It is 85 MB.

inception.maybe_download()

Downloading Inception v3 Model …

Data has apparently already been downloaded and unpacked.

Loading Inception model

Load the model to prepare for image classification.

Note the deprecation warning below, which may cause the program to fail in future versions.

model = inception.Inception()

/home/magnus/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py:1811: VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
  result_shape.insert(dim, 1)

Helper function for classifying and plotting images

This is a simple wrapper function that displays the image, classifies it with the Inception model, and finally prints the classification scores.

def classify(image_path):
    # Display the image.
    display(Image(image_path))

    # Use the Inception model to classify the image.
    pred = model.classify(image_path=image_path)

    # Print the scores and names for the top-10 predictions.
    model.print_scores(pred=pred, k=10, only_first_name=True)

The panda

The Inception data file contains this panda image. The Inception model is fairly certain (classification score 89.23%) that the image shows a panda; the second-highest score of 0.86% is for the indri, another exotic animal.

image_path = os.path.join(inception.data_dir, 'cropped_panda.jpg')
classify(image_path)

89.23% : giant panda
 0.86% : indri
 0.26% : lesser panda
 0.14% : custard apple
 0.11% : earthstar
 0.08% : sea urchin
 0.05% : forklift
 0.05% : soccer ball
 0.05% : go-kart
 0.05% : digital watch

Explanation of classification scoring

The output of the Inception model comes from a softmax function, which was also used in the neural networks of the previous tutorials.

The softmax outputs are sometimes called probabilities because they lie between zero and one and sum to one, just like a probability distribution. But they are not probabilities in the traditional sense, because they do not come from repeated experiments.
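As a reminder, the softmax function can be sketched in a few lines of NumPy; note how the outputs lie between zero and one and sum to one:

```python
import numpy as np

def softmax(logits):
    # Subtract the max logit for numerical stability,
    # then exponentiate and normalize so the outputs sum to one.
    e = np.exp(logits - np.max(logits))
    return e / np.sum(e)

# Three made-up logits; the largest logit gets the largest score.
scores = softmax(np.array([5.0, 2.0, 0.5]))
```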

It may be better to call the neural network's output values classification scores or rankings, because the scores indicate how strongly the network believes the input image belongs to each possible class.

In the panda example above, the Inception model gives the panda class a high score of 89.23%, while the other 999 classes all score below 1%. This means the Inception model is quite confident that the image shows a panda, and the remaining scores below 1% should be considered noise. For example, the tenth-highest score of 0.05% is for a digital watch, but this is more likely due to imprecision in the neural network than an indication that the image looks slightly like a digital watch.
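Ranking the roughly 1000 output scores and keeping only the best few, much like model.print_scores() does internally, can be sketched with NumPy; the scores and class names below are made up for illustration:

```python
import numpy as np

# Hypothetical softmax output over five classes (the real model has ~1000).
scores = np.array([0.002, 0.8923, 0.0086, 0.0026, 0.0014])
class_names = ['toucan', 'giant panda', 'indri', 'lesser panda', 'custard apple']

# Indices of the 3 highest scores, highest first.
top_k = np.argsort(scores)[::-1][:3]
for i in top_k:
    print('{0:7.2%} : {1}'.format(scores[i], class_names[i]))
```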

Sometimes the Inception model does not determine which category the image belongs to, so there is not a particularly high score in the results. A sample of this is shown below.

Parrot (original image)

The Inception model is quite certain (score 97.30%) that this image shows a species of parrot called a macaw.

classify(image_path="images/parrot.jpg")

97.30% : macaw
 0.07% : African grey
 0.07% : toucan
 0.05% : jacamar
 0.04% : bee eater
 0.04% : lorikeet
 0.02% : sulphur-crested cockatoo
 0.02% : jay
 0.01% : kite
 0.01% : sandbar

Parrot (resized image)

The Inception model works on input images that are 299 x 299 pixels. The parrot image above is actually 320 pixels wide and 785 pixels high, so it is automatically resized by the Inception model.
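The resizing itself is conceptually simple; a rough nearest-neighbour sketch in plain NumPy (the model does this internally with TensorFlow ops, so this is only an illustration):

```python
import numpy as np

def resize_nearest(image, new_h, new_w):
    # For each output pixel, pick the nearest source pixel.
    h, w = image.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return image[rows][:, cols]

# A dummy 785 x 320 "image" squeezed into 299 x 299, like the parrot photo.
image = np.zeros((785, 320, 3), dtype=np.uint8)
resized = resize_nearest(image, 299, 299)
print(resized.shape)  # (299, 299, 3)
```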

Now let us see what the image looks like after the Inception model has resized it.

First we implement a helper function that retrieves the resized image from inside the Inception model.

def plot_resized_image(image_path):
    # Get the resized image from the Inception model.
    resized_image = model.get_resized_image(image_path=image_path)

    # Plot the image.
    plt.imshow(resized_image, interpolation='nearest')

    # Ensure that the plot is shown.
    plt.show()

Now plot the resized parrot image, which is the actual input to the neural network in the Inception model. We can see that it has been squeezed into a square and the resolution reduced, so the image looks pixelated and jagged.

In this case the image still clearly shows a parrot, but some images become so distorted by the model's naive resizing that you may want to resize the image yourself before feeding it to the Inception model.

plot_resized_image(image_path="images/parrot.jpg")

Parrot (cropped image, top)

Here the parrot image has been manually cropped to 299 x 299 pixels and then input to the Inception model, which is quite certain (score 97.38%) that it shows a parrot (macaw).
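Cropping a region like this manually is just array slicing, assuming the image is loaded as a NumPy array (the helper below is only a sketch, not the tool used to produce the tutorial's cropped files):

```python
import numpy as np

def crop(image, top, left, size=299):
    # Cut out a size x size window with its upper-left corner at (top, left).
    return image[top:top + size, left:left + size]

image = np.zeros((785, 320, 3), dtype=np.uint8)  # dummy parrot-sized image
head = crop(image, 0, 10)  # a window near the top of the image
print(head.shape)  # (299, 299, 3)
```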

classify(image_path="images/parrot_cropped1.jpg")

97.38% : macaw
 0.09% : African grey
 0.03% : sulphur-crested cockatoo
 0.02% : toucan
 0.02% : reflex camera
 0.01% : comic book
       : backpack
       : bib
       : vulture
       : lens cap

Parrot (cropped image, center)

Here is another cropped image of the parrot, this time showing the torso without the head and tail. The Inception model is still quite certain (score 93.94%) that it is a macaw.

classify(image_path="images/parrot_cropped2.jpg")


93.94% : macaw
 0.77% : toucan
 0.55% : African grey
 0.13% : jacamar
 0.12% : bee eater
 0.11% : sulphur-crested cockatoo
 0.10% : parakeet
 0.09% : jay
 0.07% : lorikeet
 0.05% : hornbill

Parrot (cropped image, bottom)

This cropped image only shows the parrot's tail. Now the Inception model is rather confused, believing the image might show a jacamar (score 26.11%), another exotic bird, or perhaps a grasshopper (score 10.61%).

The Inception model also suggests the image might show a fountain pen (score 2.00%), but this is a low score and should be interpreted as unreliable noise.

classify(image_path="images/parrot_cropped3.jpg")

26.11% : jacamar
10.61% : grasshopper
 4.05% : chime
 2.24% : bulbul
 2.00% : fountain pen
 1.60% : leafhopper
 1.26% : cricket
 1.25% : kite
 1.13% : macaw
 0.80% : torch

Parrot (padded image)

For the Inception model, the best way to input this image is to pad it into a square and then resize it to 299 x 299 pixels. In this parrot example, the model classifies the image correctly with a score of 96.78%.
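Padding an image into a square before resizing can be sketched with np.pad; the white border colour here is an arbitrary choice:

```python
import numpy as np

def pad_to_square(image, fill=255):
    # Pad the shorter side symmetrically so that height == width.
    h, w = image.shape[:2]
    size = max(h, w)
    pad_h, pad_w = size - h, size - w
    return np.pad(image,
                  ((pad_h // 2, pad_h - pad_h // 2),
                   (pad_w // 2, pad_w - pad_w // 2),
                   (0, 0)),
                  mode='constant', constant_values=fill)

image = np.zeros((785, 320, 3), dtype=np.uint8)  # dummy parrot-sized image
padded = pad_to_square(image)
print(padded.shape)  # (785, 785, 3)
```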

classify(image_path="images/parrot_padded.jpg")


96.78% : macaw
 0.06% : toucan
 0.06% : African grey
 0.05% : bee eater
 0.04% : sulphur-crested cockatoo
 0.03% : king penguin
 0.03% : jacamar
 0.03% : lorikeet
 0.01% : kite
 0.01% : anemone fish

Elon Musk (299 x 299 pixels)

This image shows Elon Musk, living legend and super-nerd-hero. But the Inception model is confused about what the image shows, predicting that it might be a sweatshirt (score 19.73%) or an abaya (score 16.82%). It also thinks the image might show a ping-pong ball (score 3.05%) or a baseball (score 1.86%). The Inception model is confused and the classification scores are unreliable.

classify(image_path="images/elon_musk.jpg")


19.73% : sweatshirt
16.82% : abaya
 4.17% : suit
 3.46% : trench coat
 3.05% : ping-pong ball
 1.92% : cellular telephone
 1.86% : baseball
 1.77% : jersey
 1.54% : kimono
 1.43% : water bottle

Elon Musk (100 x 100 pixels)

If we use a 100 x 100 pixel image of Elon Musk instead, the Inception model thinks it might be a sweatshirt (score 17.85%) or a cowboy boot (score 16.36%). The model now makes some different predictions, but it is still confused.

classify(image_path="images/elon_musk_100x100.jpg")


17.85% : sweatshirt
16.36% : cowboy boot
10.68% : balance beam
 8.87% : abaya
 5.36% : suit
 4.57% : loafer
 2.94% : trench coat
 2.65% : maillot
 1.87% : jersey
 1.42% : unicycle

The Inception model automatically upscales the image from 100 x 100 to 299 x 299 pixels, as shown below. Notice how pixelated and jagged it is, even though a human can easily see that it is an image of a man with his arms crossed.

plot_resized_image(image_path="images/elon_musk_100x100.jpg")

Charlie and the Chocolate Factory (Gene Wilder)

This image shows actor Gene Wilder in the 1971 film Charlie and the Chocolate Factory. The Inception model is quite certain (score 97.22%) that the image shows a bow tie, and although this is true, a human would more likely say that the image shows a person.

The reason may be that during training the Inception model was shown images of people wearing bow ties labelled as "bow tie" rather than "person". So perhaps the category name should be "bow-tie wearer" instead of just "bow tie".

classify(image_path="images/willy_wonka_old.jpg")


97.22% : bow tie
 0.92% : cowboy hat
 0.21% : sombrero
 0.09% : suit
 0.06% : bolo tie
 0.05% : Windsor tie
 0.04% : cornet
 0.03% : flute
 0.02% : banjo
 0.02% : revolver

Charlie and the Chocolate Factory (Johnny Depp)

This image shows actor Johnny Depp in the 2005 film Charlie and the Chocolate Factory. The Inception model believes the image shows "sunglasses" (score 31.48%) or "sunglass" (score 18.77%). The full name of the first category is actually "sunglasses, dark glasses, shades". For some reason the Inception model was trained to recognize two very similar categories of glasses. Again, the image does show sunglasses, which is true, but a human would more likely say that the image shows a person.

classify(image_path="images/willy_wonka_new.jpg")


31.48% : sunglasses
18.77% : sunglass
 1.55% : velvet
 1.02% : wig
 0.77% : cowboy hat
 0.69% : seat belt
 0.67% : sombrero
 0.62% : jean
 0.46% : poncho
 0.43% : jersey

Close the TensorFlow session

We are now done using TensorFlow, so we close the session and release its resources. Note that the TensorFlow session lives inside the Inception model object, so we close it through the model.

# This has been commented out in case you want to modify and experiment
# with the Notebook without having to restart it.
# model.close()

Conclusion

This tutorial explained how to use the pre-trained Inception V3 model. It took several weeks to train on a powerful computer, but we can download the finished model from the internet and use it for image classification on an ordinary PC.

Unfortunately, the Inception model has problems identifying people, probably because of the training set that was used. A newer version of the Inception model has been released, but it was likely trained on the same training set and may still have trouble identifying people. Hopefully future models will be trained to recognize common objects such as humans.

In this tutorial we hid the TensorFlow implementation details in the inception.py file because it is a bit messy, and we may want to reuse it in future tutorials. Hopefully the TensorFlow developers will standardize and simplify the API for loading pre-trained models, so that everyone can use a powerful image classifier with just a few lines of code.

Exercises

Here are some suggested exercises that may help improve your TensorFlow skills. Practical experience is important for learning how to use TensorFlow properly.

Before you make changes to this Notebook, you may want to make a backup.

  • Use your own images or those you found on the Internet.
  • Crop, resize, distort the image, see how it affects classification accuracy.
  • Add print statements in several different places in the code. You can also debug the inception.py file directly.
  • Try using the newer models that have just been released. They are loaded differently from the Inception V3 model and can be more challenging to use.
  • Explain to a friend how the program works.