Authors | Ekin Dogus Cubuk, Barret Zoph
Translator | nuclear coke
Editor | Vincent
AI Front introduction: Much of the success of deep learning in computer vision can be attributed to the abundance of readily available labeled training data: as the quality, diversity, and amount of training data improve, model performance generally improves as well.

However, gathering enough high-quality data to train a well-performing model is often extremely difficult. One way to work around this is to hardcode image symmetries into the neural network architecture so that it performs better. Alternatively, experts can manually design data augmentation methods, such as rotations and flips, that are commonly used to train high-performing vision models.

Until now, relatively little attention has been paid to using machine learning to automatically augment existing data. Inspired by AutoML's success in automatically designing neural network architectures and optimizers, we asked: can data augmentation be automated as well?

In our recent paper (https://arxiv.org/abs/1805.09501), we explore a reinforcement learning algorithm that increases both the amount and diversity of data in an existing training dataset. In simple terms, the point of data augmentation is to teach the model which image invariances hold in the data domain, so that the neural network becomes invariant to these important symmetries and ultimately performs better. Unlike previous deep learning models, which relied on manually designed data augmentation policies, we use reinforcement learning to find the optimal image transformation policies from the data itself. As a result, the performance of computer vision models improves without requiring users to supply new, ever-expanding datasets.
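To make the search loop concrete, here is a minimal sketch of the idea, assuming stand-in helpers (sample_policy, train_child_model) and an illustrative operation list. It is not the implementation from the paper, whose controller is a recurrent network trained with a policy-gradient method:

```python
import random

# Illustrative stand-ins, not the paper's actual search space or helpers.
OPERATIONS = ["rotate", "shear_x", "translate_y", "invert", "color"]

def sample_policy():
    """Stand-in controller: sample two (operation, probability, magnitude) triples."""
    return [(random.choice(OPERATIONS), random.random(), random.randint(0, 9))
            for _ in range(2)]

def train_child_model(policy):
    """Stand-in for training a small 'child' model with the sampled policy and
    returning its validation accuracy; here a random number takes its place."""
    return random.random()

best_policy, best_reward = None, -1.0
for step in range(100):                 # the real search runs many more trials
    policy = sample_policy()            # the controller proposes a policy
    reward = train_child_model(policy)  # validation accuracy is the reward
    # A real controller would be updated with this reward via policy gradients;
    # this sketch simply keeps the best sample seen so far.
    if reward > best_reward:
        best_policy, best_reward = policy, reward

print("best policy found:", best_policy)
```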

Augmenting training data

The basic idea behind data augmentation is simple: images contain many symmetries that do not change the information present in them. For example, the mirror reflection of a dog is still a dog. Some of these "invariances" are obvious to humans, while others are far less intuitive. For instance, the mixup method (https://arxiv.org/abs/1710.09412) augments data by overlapping images with each other during training, producing data that improves neural network performance.

Left: an original image from the ImageNet dataset. Right: the same image flipped from left to right, a common data augmentation transformation.
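Both the flip shown above and the mixup-style overlap mentioned earlier can be expressed in a few lines of NumPy. This is a minimal sketch, not code from either paper:

```python
import numpy as np

def horizontal_flip(image):
    """Mirror an (H, W, C) image left to right; a dog is still a dog."""
    return image[:, ::-1, :]

def mixup(image_a, label_a, image_b, label_b, alpha=0.2):
    """Blend two training examples in the spirit of mixup
    (https://arxiv.org/abs/1710.09412): the new example is a convex
    combination of both images and both (one-hot) labels."""
    lam = np.random.beta(alpha, alpha)
    image = lam * image_a + (1.0 - lam) * image_b
    label = lam * label_a + (1.0 - lam) * label_b
    return image, label
```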

AutoAugment designs a customized automatic data augmentation policy for a given computer vision dataset. It draws on basic image transformation operations such as horizontal/vertical flips, rotations, and color changes. AutoAugment predicts not only which image transformations to combine, but also the per-image probability and magnitude with which each transformation is applied, so that the same image is not always manipulated in the same way. AutoAugment selects the best policy from a search space of 2.9 × 10^32 image transformation possibilities.
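As a rough sketch of what such a policy looks like in code, each sub-policy below is a list of (operation, probability, magnitude) triples applied stochastically to an image. The operations use Pillow, and the magnitude scales are illustrative assumptions rather than the paper's exact ranges:

```python
import random
from PIL import ImageEnhance, ImageOps

# A hypothetical sub-policy in the AutoAugment style: each entry is
# (operation, probability of applying it, magnitude on a 0-9 scale).
SUB_POLICY = [("rotate", 0.7, 2), ("color", 0.3, 8)]

def apply_operation(image, op, magnitude):
    if op == "rotate":
        return image.rotate(3 * magnitude)  # illustrative scale: up to 27 degrees
    if op == "color":
        return ImageEnhance.Color(image).enhance(0.1 + 0.2 * magnitude)
    if op == "invert":
        return ImageOps.invert(image)
    raise ValueError("unknown operation: " + op)

def apply_sub_policy(image, sub_policy=SUB_POLICY):
    """Apply each operation stochastically, so the same training image is
    transformed differently each time it is seen."""
    for op, prob, magnitude in sub_policy:
        if random.random() < prob:
            image = apply_operation(image, op, magnitude)
    return image
```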

AutoAugment also learns different transformations depending on the dataset at hand. For example, on a dataset of natural-scene images of house numbers taken from Street View (SVHN, http://ufldl.stanford.edu/housenumbers/), AutoAugment focuses on geometric transforms such as shearing and translation, which represent distortions commonly observed in this dataset. In addition, AutoAugment has learned to completely invert the colors that naturally occur in the original SVHN dataset, reflecting the diversity of building materials and house-number colors around the world.

Left: an original image from the SVHN dataset. Right: the same image transformed by AutoAugment. In this case, the optimal transformation found was to shear the image and invert the pixel colors.
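The two operations in this figure are easy to reproduce with Pillow. This is a minimal sketch, with an illustrative shear magnitude:

```python
from PIL import Image, ImageOps

def shear_x(image, magnitude=0.3):
    """Shear an image horizontally with an affine transform; digits on house
    number plates remain readable under moderate shear."""
    return image.transform(image.size, Image.AFFINE, (1, magnitude, 0, 0, 1, 0))

def invert_colors(image):
    """Invert every pixel value; on SVHN only relative color appears to matter."""
    return ImageOps.invert(image)
```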

On the CIFAR-10 and ImageNet datasets, AutoAugment does not use shearing, because these datasets generally do not contain images of sheared objects, nor does it completely invert colors, since such transformations would produce unrealistic images. Instead, AutoAugment focuses on slightly adjusting the color and hue distribution while preserving the general color properties. This suggests that the actual colors of objects matter in CIFAR-10 and ImageNet, whereas in SVHN only the relative colors matter.

Left: an original image from the ImageNet dataset. Right: the same image transformed by the AutoAugment policy. In the right image, the contrast is maximized and the image is rotated.
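A gentle color adjustment of the kind described above might look as follows; the 0.9 to 1.1 jitter range is an assumption for illustration, not a value from the paper:

```python
import random
from PIL import ImageEnhance

def gentle_color_jitter(image):
    """Make small random adjustments to saturation and brightness around 1.0,
    keeping object colors realistic rather than inverting or recoloring them."""
    image = ImageEnhance.Color(image).enhance(random.uniform(0.9, 1.1))
    image = ImageEnhance.Brightness(image).enhance(random.uniform(0.9, 1.1))
    return image
```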

Results

Our AutoAugment algorithm found augmentation policies for some of the most well-known computer vision datasets that, when incorporated into neural network training, lead to state-of-the-art accuracies. By augmenting ImageNet data, we achieved a new state-of-the-art accuracy of 83.54%; on CIFAR-10, our error rate was just 1.48%, which is 0.83 percentage points better than the default data augmentation designed by scientists. On SVHN, we reduced the error rate from 1.30% to 1.02%. Importantly, the policies discovered by AutoAugment are transferable: the policy found for the ImageNet dataset can be applied to other vision datasets (Stanford Cars, FGVC-Aircraft, etc.) to improve neural network performance.

We are pleased to see that our AutoAugment algorithm achieved this level of performance on many different competitive computer vision datasets, and we look forward to applying it to more computer vision tasks in the future, and even to other domains such as audio processing or language modeling. Interested readers can find the best-performing policies in the appendix of the paper (https://arxiv.org/abs/1805.09501) and use them to improve model quality on their own vision tasks.

Acknowledgments

Special thanks go to Dandelion Mane, Vijay Vasudevan, and Quoc V. Le, co-authors of the paper. We would also like to thank Alok Aggarwal, Gabriel Bender, Yanping Huang, Pieter-Jan Kindermans, Simon Kornblith, Augustus Odena, Avital Oliver, and Colin Raffel for their invaluable assistance on this project.

Original link:

https://ai.googleblog.com/2018/06/improving-deep-learning-performance.html