1. Introduction

Data augmentation is a technique commonly used across computer vision tasks. By applying various transformations to images, it extracts more value from limited data. It refers to altering the characteristics of samples according to prior knowledge, while keeping the labels unchanged, so that the newly generated samples still conform to, or approximate, the true data distribution. Data augmentation helps prevent over-fitting: enlarging the dataset is the most effective way to combat over-fitting, but collecting and annotating samples is often expensive. On limited datasets, augmentation increases the number of training samples and improves performance to a certain extent. Common augmentation methods include affine transformation, perspective transformation, brightness adjustment, sharpening, blurring, occlusion, etc. These transformations generally do not change the category label of the image.
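
As a concrete illustration, transformations of the kind listed above can be composed with an off-the-shelf library. The sketch below uses torchvision; the particular transforms and parameter values are my own illustrative choices, not part of AutoAugment:

```python
import torchvision.transforms as T

# A hand-designed augmentation pipeline of the kind described above.
# Every transform keeps the class label unchanged.
train_transform = T.Compose([
    T.RandomAffine(degrees=15, translate=(0.1, 0.1)),   # affine transformation
    T.RandomPerspective(distortion_scale=0.3, p=0.5),   # perspective transformation
    T.ColorJitter(brightness=0.4),                      # brightness change
    T.GaussianBlur(kernel_size=3),                      # blur
    T.ToTensor(),
    T.RandomErasing(p=0.25),                            # occlusion (applied on the tensor)
])
```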

Over the years, many augmentation methods have been proposed in computer vision, but the augmentation strategy for each dataset has typically been hand-designed or chosen from experience, and a different strategy may be needed for each dataset. Before AutoAugment there was little research on automating this choice. The core idea of AutoAugment is to use a reinforcement learning algorithm to find the most suitable augmentation policy for each dataset.

2. AutoAugment principle

2.1 Search Space

AutoAugment formulates the choice of augmentation strategy as a search problem: it defines a search space of augmentation policies and then searches that space for the best policy.

AutoAugment defines the search space as follows:

  • The search produces an augmentation policy.
  • A policy consists of five sub-policies.
  • A sub-policy consists of two operations (drawn from 16 candidate operations).
  • Each operation has two hyperparameters: the probability of applying it and its magnitude.

A sample search result is shown below:
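
Concretely, a searched policy can be written down as a plain data structure: five sub-policies, each a pair of (operation, probability, magnitude) tuples. The values below are illustrative placeholders in the spirit of the paper's CIFAR-10 policy, not the exact search output:

```python
# One policy = 5 sub-policies; each sub-policy = 2 operations.
# Each operation = (name, probability of applying it, magnitude in 1-10).
# The concrete values are illustrative, not the policy reported in the paper.
policy = [
    [("Invert",       0.1, 7), ("Contrast",   0.2, 6)],
    [("Rotate",       0.7, 2), ("TranslateX", 0.3, 9)],
    [("Sharpness",    0.8, 1), ("Sharpness",  0.9, 3)],
    [("ShearY",       0.5, 8), ("TranslateY", 0.7, 9)],
    [("AutoContrast", 0.5, 8), ("Equalize",   0.9, 2)],
]
```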

AutoAugment search space analysis:

  • Discretization of probability and magnitude: the probability is discretized into 11 values (0 to 1 in steps of 0.1) and the magnitude into 10 integer values (1-10), so a suitable policy can be found with a discrete search algorithm.
  • Each sub-policy has (16 × 10 × 11)^2 possible configurations.
  • Each policy has (16 × 10 × 11)^(2×5) ≈ 2.9 × 10^32 possible configurations (see the quick check below).
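
A quick check of these counts (the factors 16, 10 and 11 come from the definitions above):

```python
# Options per single operation: 16 ops x 10 magnitudes x 11 probabilities.
per_operation = 16 * 10 * 11            # 1760
per_sub_policy = per_operation ** 2     # two operations per sub-policy
per_policy = per_operation ** (2 * 5)   # five sub-policies per policy

print(f"{per_sub_policy:,}")            # 3,097,600
print(f"{per_policy:.2e}")              # ~2.9e+32
```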

2.2 Search Algorithm

Following NAS, reinforcement learning is adopted as the search algorithm. In each round, an RNN controller outputs an augmentation policy S; a child model is trained with S for 120 epochs on a reduced subset of the target dataset; the resulting validation accuracy R is then used as the reward to update the RNN controller. This loop is repeated about 15,000 times.
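
The loop can be sketched as follows. This is only a structural skeleton under my own naming: the RNN controller, the child-model training, and the PPO update are replaced by stubs, since the point here is just the shape of the search:

```python
import random

# 16 operations in the paper; only a few are listed here for brevity.
CANDIDATE_OPS = ["Invert", "Contrast", "Rotate", "TranslateX", "Sharpness"]

def sample_policy(controller_state):
    """Stand-in for the RNN controller: sample 5 sub-policies of 2 operations each."""
    return [
        [(random.choice(CANDIDATE_OPS), random.randint(0, 10) / 10, random.randint(1, 10))
         for _ in range(2)]
        for _ in range(5)
    ]

def train_child_model(policy, epochs=120):
    """Stand-in for training a child model with the policy on a reduced dataset.
    Returns the validation accuracy R (dummy value here)."""
    return random.random()

def update_controller(controller_state, policy, reward):
    """Stand-in for the PPO update of the RNN controller."""
    return controller_state

controller_state = None
for step in range(15_000):                       # ~15,000 policies sampled in total
    S = sample_policy(controller_state)          # controller proposes a policy
    R = train_child_model(S, epochs=120)         # reward = validation accuracy
    controller_state = update_controller(controller_state, S, R)
```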

Note: the controller is updated with Proximal Policy Optimization (PPO). Because this search procedure is extremely time-consuming, later augmentation-search methods largely abandoned reinforcement learning, so it is not discussed further here.

2.3 Use of search results

After the search is complete, the five best-performing policies are concatenated into a single policy (25 sub-policies in total). During final training, one of the 25 sub-policies is randomly selected and applied to each batch of data.
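
In practice the search does not have to be repeated to use this result: torchvision ships the published AutoAugment policies as a ready-made transform that applies one randomly chosen sub-policy per image. A minimal usage sketch, assuming a reasonably recent torchvision:

```python
import torchvision.transforms as T
from torchvision.transforms import AutoAugment, AutoAugmentPolicy

# AutoAugment picks one sub-policy of the published CIFAR-10 policy at random
# each time an image passes through the pipeline.
train_transform = T.Compose([
    AutoAugment(policy=AutoAugmentPolicy.CIFAR10),
    T.ToTensor(),
])
```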

3. Experimental results

The best strategies found on CIFAR-10:

As shown in the figure below, the searched policies perform well on every dataset, reducing the classification error rate to varying degrees:

In addition, the augmentation policy found on ImageNet transfers reasonably well when applied to other datasets:

4. Contributions and disadvantages of AutoAugment

Main contributions:

  • It is the first work on optimizing data augmentation strategies, and it shows that searching for augmentation policies is feasible and effective.
  • The proposed search space (i.e., the definition of an augmentation policy) has proved very useful as a reference; subsequent papers largely follow this definition.

Disadvantages:

  • The search space is huge, and using reinforcement learning as the search algorithm is extremely time-consuming: on CIFAR-10, the performance gain from augmentation usually appears only after 80-100 training epochs, and during the search each of the 15,000 child models is trained for 120 epochs. In other words, a total of about 120 × 15,000 epochs must be trained.
  • The improvement over random search is not obvious: the 25 searched sub-policies contain 50 operations in total; if the operation order is shuffled and the operation, probability, and magnitude are randomly re-sampled each time, the final performance differs by only about 0.4 percentage points. Moreover, the authors never compared AutoAugment against a random-search baseline, which suggests the reinforcement-learning search itself may contribute little beyond the search space.
