What do we do when we don’t have much varied training data? This is a quick introduction to using data augmentation in TensorFlow to perform in-memory image transformations during model training to help overcome this data shortage.

![](https://static001.geekbang.org/infoq/62/62c366963508bb29ba51f19ceddef685.jpeg?x-oss-process=image/resize,p_80/auto-orient,1)

The success of image classification is driven, at least in large part, by the sheer amount of available training data. For now, take it as given that the more image data we train on, the better our chances of building an effective model.

But what do we do if we don’t have a lot of training data? A few broad approaches to this problem come to mind immediately, notably transfer learning and data augmentation.

Transfer learning is the process of applying an existing machine learning model to a scenario for which it was not originally intended. This reuse can save training time and extend the usefulness of existing models, which may have been trained for a long time on very large data sets, using data and compute we do not have ourselves. If a model has already been trained on a large amount of data, we can fine-tune it so that it performs well with only a small amount of our own data.

Data augmentation increases the size and diversity of an existing training data set without manually collecting any new data. The augmented data is obtained by applying a series of preprocessing transformations to the existing data, which for image data can include horizontal and vertical flips, skews, crops, rotations, and so on. In short, rather than duplicating the same data, augmentation simulates a variety of slightly different data points. The subtle differences in these “additional” images should be enough to help train a more robust model. At least, that’s the idea.
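To make the idea concrete, here is a minimal sketch using plain NumPy rather than TensorFlow. The tiny 3x3 array and the helper name simple_variants are illustrative assumptions; the point is only that one source image yields several distinct variants.

```python
import numpy as np

# A tiny 3x3 single-channel "image"; real images would be (height, width, channels).
image = np.arange(9).reshape(3, 3)


def simple_variants(img):
    """Return a few deterministic variants of one image (illustrative helper)."""
    return {
        "original": img,
        "horizontal_flip": np.fliplr(img),  # mirror left-right
        "vertical_flip": np.flipud(img),    # mirror top-bottom
        "rotate_90": np.rot90(img),         # rotate 90 degrees counterclockwise
    }


variants = simple_variants(image)
for name, variant in variants.items():
    print(name)
    print(variant)
```

Real augmentation pipelines apply such transformations randomly and with finer-grained parameters, but the principle is the same.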

This article focuses on the practical implementation of the second approach, data augmentation, in TensorFlow to mitigate the problem of having too little image training data; a similarly practical treatment of transfer learning will follow later.

How image augmentation helps

When a convolutional neural network learns image features, we want those features to be learned in a variety of orientations, so that the trained model can recognize, for example, that human legs can appear both vertically and horizontally in an image. Beyond increasing the raw number of data points, augmentation helps here by applying transformations such as image rotation. As another example, we can use a vertical flip to help the model learn to recognize a cat whether it is upright or has been photographed upside down.

Data augmentation is not a panacea; we don’t expect it to solve all of our small-data problems, but it can be effective in many situations, either as part of a comprehensive model training approach or in combination with another technique for coping with limited data (e.g., transfer learning).

Image augmentation in TensorFlow

In TensorFlow, data augmentation is done with the ImageDataGenerator class. It is very easy to understand and use. The entire data set is looped over in each epoch, and the images are transformed according to the options and values selected. These transformations are performed in memory, so no additional storage is required (although the save_to_dir parameter can be used to save the augmented images to disk if desired).
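The in-memory behavior can be sketched as follows. This assumes TensorFlow 2.x with SciPy installed; the random stand-in images, the batch size, and the seed are illustrative assumptions. Only flips are enabled, so the output contains the same pixel values as the input, just rearranged.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: 4 random RGB "images" of 8x8 pixels.
rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3)).astype("float32")

# Random flips only, to keep the sketch small.
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    horizontal_flip=True,
    vertical_flip=True,
)

# flow() yields batches indefinitely; every pass over the data draws new
# random transforms, and nothing is written to disk.
batches = datagen.flow(images, batch_size=4, shuffle=False, seed=42)

first_pass = next(batches)
second_pass = next(batches)

print(first_pass.shape)
# The two passes usually differ because the flips are re-drawn each time.
print(np.array_equal(first_pass, second_pass))
```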

If you are using TensorFlow, you may already have used ImageDataGenerator simply to rescale existing images, without any additional augmentation. It might look something like this:

![](https://static001.geekbang.org/infoq/45/4511dded708197e12dc86f0c6a94ad87.png)
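As a text sketch of this kind of rescale-only setup: the 1/255 factor, the variable names, and the stand-in array are assumptions for illustration, not taken from the original code, and the example assumes TensorFlow 2.x with SciPy installed.

```python
import numpy as np
import tensorflow as tf

# Rescale-only generator: maps 8-bit pixel values from [0, 255] to roughly [0, 1].
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1.0 / 255)

# Hypothetical stand-in batch: two 4x4 RGB images with every pixel at 255.
images = np.full((2, 4, 4, 3), 255, dtype="float32")

# No augmentation options are set, so the only change is the rescaling.
batch = next(train_datagen.flow(images, batch_size=2, shuffle=False))
print(batch.min(), batch.max())
```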

An updated ImageDataGenerator that also performs augmentation might look like this:

![](https://static001.geekbang.org/infoq/25/25d4baef979661d8e7758648721e343b.png)

What does that mean?

  • **rotation_range** - Range, in degrees, for random rotations; 20 degrees in the example above

  • **width_shift_range** - Fraction of the total width (if the value is < 1, as it is here) by which to randomly shift the image horizontally; 0.2 in the example above

  • **height_shift_range** - Fraction of the total height (if the value is < 1, as it is here) by which to randomly shift the image vertically; 0.2 in the example above

  • **shear_range** - Shear angle in the counterclockwise direction, in degrees, for shear transformations; 0.2 in the example above

  • **zoom_range** - Range for random zoom; 0.2 in the example above

  • **horizontal_flip** - Boolean; whether to randomly flip images horizontally; True in the example above

  • **vertical_flip** - Boolean; whether to randomly flip images vertically; True in the example above

  • **fill_mode** - How points outside the boundaries of the input are filled: “constant”, “nearest”, “reflect”, or “wrap”; “nearest” in the example above
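The options described above can be put together as in the following sketch. The values mirror those listed in the bullets; the rescale argument, the variable names, and the 64x64 image shape are assumptions for illustration. The call to get_random_transform shows one random draw of the parameters that the generator would apply to a single image.

```python
import tensorflow as tf

train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255,       # assumed: same rescaling as a rescale-only setup
    rotation_range=20,       # rotate by up to +/- 20 degrees
    width_shift_range=0.2,   # shift horizontally by up to 20% of the width
    height_shift_range=0.2,  # shift vertically by up to 20% of the height
    shear_range=0.2,         # shear intensity
    zoom_range=0.2,          # zoom factor drawn from [0.8, 1.2]
    horizontal_flip=True,    # randomly mirror left-right
    vertical_flip=True,      # randomly mirror top-bottom
    fill_mode="nearest",     # fill newly exposed pixels with the nearest value
)

# Draw one random set of transform parameters for a 64x64 RGB image.
params = train_datagen.get_random_transform((64, 64, 3))
print(params)
```

Each image passing through the generator gets its own random draw, which is why the same source image looks slightly different from one epoch to the next.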

You can then use the ImageDataGenerator flow_from_directory method to specify the location of the training data (and, optionally, of the validation data, if you want to create a validation generator). The model is then trained with fit_generator, which feeds these augmented images to your network during training. (In recent TensorFlow 2.x releases, fit accepts generators directly and fit_generator is deprecated.) An example of this type of code is shown below:

![](https://static001.geekbang.org/infoq/f3/f317a0e904748bedb57e28d771af4ca0.png)
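A self-contained sketch of this workflow follows. It is not the original code: the throwaway directory of random images, the class names, the tiny model, and all hyperparameters are assumptions chosen only so that the example runs quickly. It assumes TensorFlow 2.x with Pillow and SciPy installed, and it uses fit rather than fit_generator.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf
from PIL import Image

# Build a tiny throwaway data set laid out as <root>/<class_name>/<image>.png.
data_dir = tempfile.mkdtemp()
rng = np.random.default_rng(0)
for class_name in ("cats", "dogs"):  # hypothetical class names
    class_dir = os.path.join(data_dir, class_name)
    os.makedirs(class_dir)
    for i in range(8):
        pixels = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
        Image.fromarray(pixels).save(os.path.join(class_dir, f"{i}.png"))

# Rescale and augment; hold out 25% of the files for validation.
# For brevity the validation subset shares this augmenting generator; a
# separate rescale-only generator is a common choice for validation data.
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,
    horizontal_flip=True,
    validation_split=0.25,
)

train_generator = datagen.flow_from_directory(
    data_dir, target_size=(32, 32), batch_size=4,
    class_mode="binary", subset="training",
)
validation_generator = datagen.flow_from_directory(
    data_dir, target_size=(32, 32), batch_size=4,
    class_mode="binary", subset="validation",
)

# A deliberately tiny model; any Keras model with a matching input shape works.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Augmented batches are generated in memory as training proceeds.
history = model.fit(train_generator, validation_data=validation_generator, epochs=1)
```

With real data, data_dir would point at an existing folder containing one subdirectory per class, and the synthetic image creation step would be dropped.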
