There was a lot of important research this week. A few days ago, CMU and Facebook announced a new advance in AI for multiplayer Texas Hold ’em, published in Science. In addition, new papers such as BlazeFace, a sub-millisecond face detection model running on mobile GPUs, have been published on the Google blog and arXiv.

1. Title: Superhuman AI for Multiplayer Poker

  • Authors: Noam Brown, Tuomas Sandholm

  • Link: https://science.sciencemag.org/content/early/2019/07/10/science.aay2400

Abstract: In recent years, AI research has made great progress, especially in games, and poker is a prominent example. In the past, however, AI benchmark performance in poker was always achieved in two-player games, while poker is traditionally played with more than two players. Multiplayer games pose far greater challenges than two-player games, and solving them is seen as a milestone in AI research. In this paper, the researchers propose an AI called Pluribus, which outperformed top human players in six-player no-limit Texas Hold ’em.

Recommendation: An AI beats top human players at multiplayer Texas Hold ’em at a win rate of about $1,000 an hour, while training required only a cloud server, no GPUs, and less than $150 in compute. The paper has been published in Science.
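
Pluribus builds on self-play counterfactual regret minimization (CFR), whose core update rule, regret matching, is simple to sketch. Below is a minimal, illustrative Python example of regret matching at a single decision point (rock-paper-scissors against a fixed opponent); the paper's actual implementation uses a far more sophisticated Monte Carlo CFR variant:

```python
import numpy as np

def regret_matching_strategy(cumulative_regret):
    """Map cumulative regrets to a strategy: positive regrets are
    normalized into action probabilities; if none are positive,
    play uniformly at random."""
    positive = np.maximum(cumulative_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full(len(cumulative_regret), 1.0 / len(cumulative_regret))

# Toy example: repeatedly play rock-paper-scissors against a fixed
# (exploitable) opponent and accumulate regrets for the three actions.
payoff = np.array([  # payoff[my_action, opp_action]
    [0, -1, 1],      # rock
    [1, 0, -1],      # paper
    [-1, 1, 0],      # scissors
])
opp_strategy = np.array([0.4, 0.3, 0.3])
cum_regret = np.zeros(3)
avg_strategy = np.zeros(3)

for _ in range(10000):
    strategy = regret_matching_strategy(cum_regret)
    avg_strategy += strategy
    # Regret = how much better each action would have done than
    # the current mixed strategy's expected payoff.
    action_values = payoff @ opp_strategy
    expected = strategy @ action_values
    cum_regret += action_values - expected

# The average strategy concentrates on the best response (paper).
print(avg_strategy / avg_strategy.sum())
```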

2. Title: Adversarial Objects Against LiDAR-based Autonomous Driving Systems

  • Authors: Yulong Cao, Chaowei Xiao, Dawei Yang, Jing Fang, Ruigang Yang, Mingyan Liu, Bo Li

  • Link: https://arxiv.org/pdf/1907.05418.pdf

Abstract: Many studies have shown that deep neural networks (DNNs) are vulnerable to adversarial examples. To demonstrate that such attacks pose a real-world threat, some studies have generated physical or printable stickers that cause classifiers to misidentify stop signs, as in the well-known attack experiments against Tesla. But autonomous driving systems are not just image classifiers: to obtain a clearer perception of their surroundings, most autonomous driving detection systems are equipped with LiDAR (light detection and ranging) or conventional radar (radio detection and ranging) equipment, with LiDAR directly surveying the 3D environment using laser beams. This raises a question: can adversarial perturbations also affect the point clouds scanned by LiDAR?

To answer this question, the researchers propose an optimization-based approach, LiDAR-Adv, to generate adversarial objects that can evade the LiDAR detection system in a variety of scenarios, exposing potential vulnerabilities in LiDAR-based autonomous driving detection systems.

The researchers first demonstrate the vulnerabilities using a black-box, evolution-based algorithm, and then use the gradient-based LiDAR-Adv approach to explore just how much impact a strong adversarial example can have.

To assess the real-world impact of LiDAR-Adv, the researchers 3D-printed the generated adversarial objects and tested them on Baidu’s Apollo autonomous driving platform. The results show that, even against a system equipped with 3D sensing and a production-grade multi-stage detector, they were able to mislead the autonomous driving system toward different adversarial goals.
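
To make the optimization-based idea concrete, here is a minimal, hypothetical sketch of a gradient-based attack in the spirit of LiDAR-Adv. The functions `render_to_point_cloud` and `detector` are stand-ins (not from the paper) for a differentiable LiDAR simulator and the detection model:

```python
import torch

def craft_adversarial_object(vertices, render_to_point_cloud, detector,
                             steps=200, lr=1e-3, smooth_weight=0.1):
    """Illustrative gradient-based attack sketch. `render_to_point_cloud`
    stands in for a differentiable LiDAR simulator (mesh vertices ->
    point cloud) and `detector` for a model returning detection
    confidence; both are hypothetical placeholders here."""
    delta = torch.zeros_like(vertices, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        perturbed = vertices + delta
        points = render_to_point_cloud(perturbed)  # simulated LiDAR scan
        confidence = detector(points)              # detection confidence
        # Push detection confidence down while keeping the perturbation
        # small and smooth enough for the object to remain 3D-printable.
        loss = confidence + smooth_weight * delta.pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return (vertices + delta).detach()
```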

Recommendation: The University of Michigan, UIUC, and Baidu used 3D-printed objects as adversarial examples that can effectively fool the 3D sensors of autonomous vehicles. The paper shows that even expensive LiDAR is not safe, and that improving the robustness of the algorithms themselves is the real solution.

3. Title: BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs

  • Authors: Valentin Bazarevsky, Yury Kartynnik, Andrey Vakunov, Karthik Raveendran, Matthias Grundmann

  • Link: https://arxiv.org/pdf/1907.05047

Abstract: In this paper, we present a face detector named BlazeFace. The model is lightweight and performs well; it runs inference on mobile GPUs at 200-1000+ FPS on flagship devices. Such a model can serve any augmented reality task that requires a precise face region as input, including 2D/3D facial keypoint or geometry estimation, facial feature or expression recognition, and face region segmentation.

Contributions of the paper include: a lightweight feature extraction network related to, but distinct from, MobileNetV1/V2; a GPU-friendly anchor scheme adapted from the Single Shot MultiBox Detector (SSD); and an improved tie resolution strategy that replaces non-maximum suppression (NMS).
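
The tie-resolution idea can be sketched simply: instead of keeping only the highest-scoring box and discarding its overlapping neighbors (as NMS does), overlapping predictions are blended into one box via a score-weighted average. A minimal NumPy sketch (illustrative, not Google's implementation):

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, format [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def blend_detections(boxes, scores, iou_thresh=0.3):
    """Replace suppression with blending: each output box is the
    score-weighted average of all predictions that overlap it."""
    order = np.argsort(scores)[::-1]
    boxes, scores = boxes[order], scores[order]
    used = np.zeros(len(boxes), dtype=bool)
    results = []
    for i in range(len(boxes)):
        if used[i]:
            continue
        group = (iou(boxes[i], boxes) >= iou_thresh) & ~used
        used |= group
        w = scores[group][:, None]
        # Weighted mean of overlapping boxes instead of discarding them,
        # which stabilizes boxes across frames in video.
        results.append((boxes[group] * w).sum(axis=0) / w.sum())
    return np.array(results)
```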

Recommendation: Lightweight, fast, and accurate face detection models have long been a focus of research. Researchers at Google have come up with a model that runs using only a mobile device’s GPU, with extremely fast detection. With such a model, downstream augmented reality mobile applications can be developed further.

4. Title: Multilingual Universal Sentence Encoder for Semantic Retrieval

  • Authors: Yinfei Yang, Amin Ahmad

  • Link: https://ai.googleblog.com/2019/07/multilingual-universal-sentence-encoder.html

Abstract: Three new multilingual modules for the Universal Sentence Encoder are presented, adding features and expanding its potential applications. The first two modules provide multilingual models for retrieving semantically similar text; one is optimized for retrieval performance, the other for speed and memory usage. The third model is specialized for retrieval question answering in 16 languages, a new application of the universal sentence encoder. All three multilingual modules are trained with the same multi-task dual-encoder framework as the original English Universal Sentence Encoder, but the researchers developed a technique using additive margin softmax to improve dual-encoder performance. The technique ensures good performance not only in transfer learning but also in semantic retrieval tasks.
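
The additive margin softmax idea is compact: for a true pair (x, y), subtract a margin m from the matching score before the softmax, so the dual encoder must separate true pairs from in-batch negatives by at least m. A minimal PyTorch-style sketch (illustrative; the released models are TensorFlow Hub modules, and the margin/scale values here are assumptions):

```python
import torch
import torch.nn.functional as F

def additive_margin_softmax_loss(src_emb, tgt_emb, margin=0.3, scale=20.0):
    """Dual-encoder loss with an additive margin. `src_emb` and `tgt_emb`
    are batches of L2-normalized sentence embeddings where row i of each
    is a true pair; all other rows act as in-batch negatives."""
    sim = src_emb @ tgt_emb.T                 # cosine similarity matrix
    # Subtract the margin from the diagonal (the true pairs) only.
    sim = sim - margin * torch.eye(sim.size(0), device=sim.device)
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(scale * sim, labels)

# Usage: embeddings from any sentence encoder, normalized to unit length.
src = F.normalize(torch.randn(8, 512), dim=1)
tgt = F.normalize(torch.randn(8, 512), dim=1)
loss = additive_margin_softmax_loss(src, tgt)
```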

Recommendation: Universal sentence encoders produce sentence-level representations that transfer better than word vectors, and are widely used in sentence-similarity retrieval tasks. The work, described on the Google blog, extends the universal sentence encoder approach to multiple languages and to question-answering tasks, an area not covered by previous universal sentence encoder research.

5. Title: Benchmarking Model-Based Reinforcement Learning

  • Authors: Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, et al.

  • Link: https://arxiv.org/pdf/1907.02057v1.pdf

Abstract: It is widely believed that model-based reinforcement learning (MBRL) offers better sample efficiency than model-free RL. However, MBRL research is not well standardized: researchers often experiment in self-designed environments, and the field has split into several independent research directions whose implementations are sometimes closed-source or hard to reproduce. An open question, therefore, is how these existing MBRL algorithms compare with one another.

To promote MBRL research, the researchers gathered a collection of MBRL algorithms and proposed 18 benchmark environments designed specifically for MBRL. The algorithms were benchmarked on a unified set of tasks, including noisy environments. Beyond cataloguing performance, the researchers explored and unified the underlying algorithmic differences among MBRL methods, and they describe three key challenges for future MBRL research: the dynamics bottleneck, the planning-horizon dilemma, and the early-termination dilemma.
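
For readers new to MBRL, the common structure the benchmark covers is: learn a dynamics model from experience, then plan through it. A minimal random-shooting model-predictive-control sketch (a generic illustration of one family in the benchmark, not a specific algorithm from the paper):

```python
import numpy as np

def random_shooting_mpc(state, dynamics_model, reward_fn,
                        action_dim, horizon=20, n_candidates=500):
    """Plan with a learned dynamics model: sample random action
    sequences, roll each one out through the model, and return the
    first action of the highest-return sequence. `dynamics_model(s, a)`
    predicts batches of next states; `reward_fn(s, a)` scores batches
    of transitions. Both are assumed to be learned/vectorized."""
    actions = np.random.uniform(-1, 1, (n_candidates, horizon, action_dim))
    returns = np.zeros(n_candidates)
    states = np.repeat(state[None], n_candidates, axis=0)
    for t in range(horizon):
        returns += reward_fn(states, actions[:, t])
        states = dynamics_model(states, actions[:, t])  # imagined step
    # Execute only the first action, then replan at the next timestep.
    return actions[np.argmax(returns), 0]
```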

Recommendation: Model-based reinforcement learning research has lacked standardized benchmarks, which hinders reproducibility and fair performance comparison. The University of Toronto, in collaboration with UC Berkeley, benchmarked a set of model-based reinforcement learning algorithms and surveyed MBRL methods in general; the paper is an important reference for learning about the various algorithms and understanding how they are benchmarked.

6. Title: Playing Go without Game Tree Search Using Convolutional Neural Networks

  • Authors: Jeffrey Barratt and Chuanbo Pan

  • Link: https://arxiv.org/pdf/1907.04658.pdf

Abstract: Go has a long history in East Asian countries, but only in recent years has computer Go caught up with the performance of human players. While the rules of Go are simple, its strategies and combinations are extremely complex. Even in the past few years, new programs that rely on neural networks to evaluate board states still explore many orders of magnitude more board states per second than professional players do.

In this paper, the researchers aim to mimic human intuition in the game by creating a convolutional neural policy network that can reach or exceed the level of most human players without any tree search. They describe three techniques for building a strong Go player: non-rectangular convolutions (to better learn the shapes on the board), supervised learning (trained on a dataset of 53,000 professional Go games), and reinforcement learning (trained on games played against different versions of the network). The results show that, using supervised learning alone, the proposed network already exceeds the skill of the average amateur. Further training with non-rectangular convolutions and reinforcement learning should improve the level of computer Go further.
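
A convolutional policy network of this kind is straightforward to sketch: board feature planes in, a probability distribution over the 19×19 intersections out, with inference being a single forward pass. A minimal PyTorch sketch (illustrative; the paper's architecture, including its non-rectangular convolutions, differs, and the plane/channel counts here are assumptions):

```python
import torch
import torch.nn as nn

class GoPolicyNet(nn.Module):
    """Minimal convolutional policy network for Go: maps feature planes
    describing the board to a distribution over the 361 intersections
    (the pass move is omitted for simplicity)."""
    def __init__(self, in_planes=8, channels=64, n_layers=5):
        super().__init__()
        layers = [nn.Conv2d(in_planes, channels, 3, padding=1), nn.ReLU()]
        for _ in range(n_layers - 1):
            layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU()]
        self.trunk = nn.Sequential(*layers)
        self.head = nn.Conv2d(channels, 1, 1)  # one logit per intersection

    def forward(self, board_planes):  # (B, in_planes, 19, 19)
        logits = self.head(self.trunk(board_planes)).flatten(1)  # (B, 361)
        return torch.log_softmax(logits, dim=1)

# Inference is one forward pass: pick the argmax move, no tree search.
net = GoPolicyNet()
move = net(torch.zeros(1, 8, 19, 19)).argmax(dim=1)
```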

Recommendation: Although AlphaGo and other AIs have surpassed top human players at Go, they rely on tree search, which takes a long time to learn and consumes substantial computing resources. Researchers at Stanford University propose deep learning methods that can reach the level of human amateur players using supervised learning alone.

7. Title: Unsupervised Data Augmentation for Consistency Training

  • Authors: Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, Quoc V. Le

  • Link: https://arxiv.org/pdf/1904.12848

Abstract: In this paper, the researchers propose expanding the use of unlabeled data in semi-supervised learning. Their approach, called unsupervised data augmentation (UDA), forces model predictions to be consistent between unlabeled samples and augmented versions of those samples. Instead of applying random noise such as Gaussian noise or dropout, UDA makes one small change: it uses state-of-the-art data augmentation methods to generate more diverse and realistic perturbations. Even with extremely small annotated sets, this small change brings significant improvements across six language tasks and three vision tasks.
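
The consistency objective at the heart of UDA is compact: for each unlabeled example, penalize divergence between the model's prediction on the original input and on its augmented version, alongside the usual supervised loss. A minimal PyTorch sketch (assuming `augment` is a strong augmentation such as RandAugment or back-translation, as in the paper; the weighting is illustrative):

```python
import torch
import torch.nn.functional as F

def uda_loss(model, x_labeled, y_labeled, x_unlabeled, augment, lam=1.0):
    """Supervised cross-entropy on labeled data plus a consistency term
    that pulls predictions on augmented unlabeled data toward the
    (fixed) predictions on the originals."""
    sup = F.cross_entropy(model(x_labeled), y_labeled)
    with torch.no_grad():  # target distribution is held fixed
        p_orig = F.softmax(model(x_unlabeled), dim=1)
    logp_aug = F.log_softmax(model(augment(x_unlabeled)), dim=1)
    # KL(p_orig || p_aug): gradients flow only through the augmented branch.
    consistency = F.kl_div(logp_aug, p_orig, reduction='batchmean')
    return sup + lam * consistency
```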

For example, on the IMDb text classification dataset, UDA achieved a 4.20% error rate with only 20 labeled samples, beating the SOTA model trained on 25,000 labeled samples. UDA also outperformed all previous comparable methods on the standard semi-supervised learning benchmarks CIFAR-10 and SVHN, achieving a 2.7% error rate on CIFAR-10 with only 4,000 labeled samples and a 2.85% error rate on SVHN with only 250. These numbers are nearly on par with models trained on the complete datasets, which contain one to two orders of magnitude more labels. In addition, UDA performs very well on large datasets such as ImageNet: training with a 10% annotated subset, UDA improved top-1/top-5 accuracy from 55.1%/77.3% to 68.7%/88.5%; on the full ImageNet dataset with 1.3M additional unlabeled examples, UDA further improved performance from 78.3%/94.4% to 79.9%/94.5%.

Recommendation: This new paper by Quoc V. Le et al. presents a data augmentation approach that reaches, with very few labeled samples, SOTA levels that previously required large amounts of training data. Such augmentation methods can further inspire few-shot and zero-shot model research and reduce deep learning models’ dependence on data.