Abstract: On April 25, Huawei Cloud released the Pangu series of super-large-scale pre-training models, including the world's largest visual (CV) pre-training model, with 3 billion parameters, and the world's largest Chinese language (NLP) pre-training model, with 100 billion parameters and 40 TB of training data, jointly developed with Recurrent AI and Pengcheng Laboratory. In the future, Huawei Cloud will release multi-modal, scientific-computing and other super-large pre-training models.

This article is shared from the Huawei Cloud community post "HDC.Cloud 2021 | Huawei Cloud Releases the World's Largest Pre-training Models, Opening a New Industrialized Mode of AI Development," by the original author, Technology Torchbearer.


Tian Qi, Chief Scientist of the AI field at Huawei Cloud and an IEEE Fellow, said: "Pre-trained large models are an important way to solve the customization and fragmentation of AI application development. The Huawei Cloud Pangu large models allow one AI model to generalize and be replicated at scale across many scenarios, reduce the reliance on data annotation, and, together with the ModelArts platform, move AI development from a workshop mode to a new industrialized mode."

▲ Tian Qi, Chief Scientist of the AI field at Huawei Cloud and IEEE Fellow

The world's largest Chinese language pre-training model breaks three world records on the CLUE benchmark

The Pangu NLP model is the world's largest Chinese language pre-training model, with hundreds of billions of parameters. It was jointly developed by Huawei Cloud, Recurrent AI and Pengcheng Laboratory. In the pre-training stage it learned from 40 TB of Chinese text data, and its performance in application scenarios is further improved through small-sample tuning on industry data.

The Pangu NLP model has achieved breakthrough progress in three aspects.

First, advanced language understanding and generation capabilities: on CLUE, the authoritative Chinese language understanding evaluation benchmark, the Pangu NLP model ranked first in the overall, classification and reading-comprehension rankings, breaking three world records on the leaderboard. Its overall score is 83.046, and its scores on several sub-tasks lead the industry, a big step toward the human level (85.61).

▲ The Pangu NLP model ranks first in the overall CLUE rankings

In the NLPCC 2018 text summarization task, the Pangu NLP large model achieved the industry's best average ROUGE score of 0.53, exceeding the second place by 60 percent.

Second, the Pangu NLP model accumulates a large amount of general knowledge in the pre-training stage, which can be used for both understanding and generation. In addition to end-to-end generation in the style of GPT-3, the large model can also learn from a small number of samples to identify intents, which are then transformed into knowledge-base and database queries. Through the modular combination of functions, it supports embedding industry knowledge bases and databases and connecting to industry experience, so it can be quickly adapted and extended to whole scenarios. For example, in the financial customer-service scenario jointly built by Huawei Cloud and Recurrent AI, the Pangu NLP large model empowers the sales process, helping service personnel quickly improve their business skills and reshaping the consumer experience.
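The intent-to-query idea described above can be sketched in a few lines. This is purely illustrative: the trivial keyword scorer below stands in for the large model's few-shot intent classifier, and the intent names and query templates are invented for the example, not taken from the Pangu system.

```python
# Hypothetical sketch: an identified intent is mapped to a database
# query template. A keyword scorer stands in for the pre-trained model.

INTENT_QUERIES = {
    "check_balance": "SELECT balance FROM accounts WHERE user_id = ?",
    "list_products": "SELECT name, rate FROM products WHERE active = 1",
}

INTENT_KEYWORDS = {
    "check_balance": {"balance", "account"},
    "list_products": {"product", "rate", "recommend"},
}

def identify_intent(utterance):
    """Score each intent by keyword overlap with the utterance.

    A real system would call the pre-trained model's few-shot
    classifier here instead of counting keywords.
    """
    words = set(utterance.lower().split())
    return max(INTENT_KEYWORDS, key=lambda i: len(INTENT_KEYWORDS[i] & words))

intent = identify_intent("what is my account balance")
sql = INTENT_QUERIES[intent]
print(intent, "->", sql)  # prints: check_balance -> SELECT balance ...
```

The design point is the modularity the article describes: the model only has to name an intent, and the query templates can be swapped per industry without retraining.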

Third, the Pangu NLP large model adopts the route of a large model plus small-sample tuning, surpassing the GPT series on small-sample learning tasks. For example, in the customer-demand-analysis scenario, when the Pangu NLP large model is used to produce semantic tags, the sample size required to reach the target result is only one tenth of that of the GPT-series models; that is, AI production efficiency can be improved tenfold.
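The "large model + small-sample tuning" route generally means freezing a pre-trained backbone and training only a small task head on a handful of labeled examples. The sketch below shows that pattern under stated assumptions: a fixed projection stands in for the frozen large model, and the toy data and logistic-regression head are invented for illustration; this is not the Pangu training code.

```python
import numpy as np

# Illustrative sketch of small-sample tuning: freeze a "pretrained"
# feature extractor and train only a tiny head on 10 labeled samples.

rng = np.random.default_rng(0)

def pretrained_features(x):
    """Frozen 'backbone': a fixed projection stands in for the large model."""
    W = np.linspace(-1, 1, x.shape[1] * 8).reshape(x.shape[1], 8)
    return np.tanh(x @ W)

# A handful of labeled samples (the small-sample regime).
X = rng.normal(size=(10, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Train only a small logistic-regression head by gradient descent;
# the backbone's weights are never updated.
F = pretrained_features(X)
w = np.zeros(F.shape[1])
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))  # sigmoid predictions
    grad_w = F.T @ (p - y) / len(y)          # gradient w.r.t. head weights
    grad_b = float(np.mean(p - y))
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
acc = float(np.mean((p > 0.5) == (y > 0.5)))
print(f"training accuracy on 10 samples: {acc:.2f}")
```

Because only the small head is trained, the number of labeled samples needed is far smaller than training a model from scratch, which is the efficiency gain the article attributes to this route.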

3 billion parameters, the world’s largest visual pre-training model

The Pangu CV large model is currently the largest visual pre-training model in the industry, with more than 3 billion parameters. For the first time, it combines image discrimination and generation capabilities, meeting the needs of both low-level image processing and high-level semantic understanding. It also facilitates fine-tuning with industry knowledge and quickly adapts to various downstream tasks. The Pangu CV large model performs excellently: its small-sample classification accuracy on the 1% and 10% ImageNet subsets reaches the state of the art (SOTA).

The Pangu CV model is committed to solving the difficulty of generalizing and replicating AI engineering, creating a new industrialized mode of AI development and greatly reducing R&D costs. In addition, the Pangu CV large model provides model pre-training, fine-tuning, deployment and iteration functions, forming a complete closed loop of AI development and greatly improving development efficiency. At present, the Pangu CV large model has been verified in more than 100 practical tasks in medical imaging, finance, industrial quality inspection and other fields, not only greatly improving business testing accuracy but also saving more than 90% of R&D costs on average.

The Pangu CV large model powers intelligent UAV power-line inspection

Chongqing Yongchuan Power Supply Company of State Grid is one of the earliest domestic power-grid enterprises to apply intelligent UAV power-inspection technology. Developing a traditional AI model for intelligent UAV inspection faces two main challenges: first, how to annotate massive amounts of data efficiently; second, there are hundreds of defect types, requiring dozens of AI recognition models, so development costs are high.

Huawei Cloud has cooperated with Chongqing Yongchuan Power Supply Company of State Grid. Compared with the traditional development mode, the Huawei Cloud Pangu CV large model has shown strong advantages in developing the intelligent UAV inspection AI model.

In terms of data annotation, the Pangu CV model is trained on massive amounts of unlabeled power-grid data and then fine-tuned with a small number of labeled samples, an efficient development mode. Huawei Cloud originally proposed a pre-trained model tailored to the electric power industry, improving sample-screening efficiency by about 30 times and screening quality by about 5 times. Taking the 50,000 images collected in Yongchuan every day as an example, this can save 170 person-days of manual labeling time.

In terms of model versatility, combined with Pangu's automatic data augmentation and category-adaptive loss-function optimization strategies, one model can be adapted to hundreds of defect types, replacing more than 20 original small models. This greatly reduces model maintenance costs, increases average accuracy by 18.4% and reduces model development costs by 90%.

The support behind the Pangu models

The Pangu NLP model involves hundreds of billions of parameters and 40 TB of training data, which poses great challenges for algorithms, computing power, massive data processing and parallel optimization.

In terms of algorithms, the Huawei Cloud algorithm team and the Recurrent AI NLP team worked together to break through the difficulty of fine-tuning large models.

Pengcheng Cloud Brain II, Pengcheng Laboratory's AI training cluster and the largest in China, demonstrated its powerful AI computing power and data throughput in the Pangu NLP large-model training, laying a solid foundation for it.

On the other hand, Huawei's underlying software, training framework and ModelArts platform are jointly optimized to fully release computing power and achieve optimal full-stack performance. First, for underlying operator performance, operator quantization, operator fusion and other optimizations based on Huawei CANN improve single-operator performance by more than 30%. Second, Huawei MindSpore innovatively adopts multi-dimensional automatic hybrid parallelism combining pipeline parallelism, model parallelism and data parallelism, which greatly reduces the manual coding workload and improves cluster linearity by 20%. The Huawei Cloud ModelArts platform provides exascale compute scheduling and, combined with the physical network topology, dynamic routing planning, delivering optimal network communication for large-model training. In addition, processing the 40 TB of text data took only 7 days thanks to the ModelArts platform's efficient handling of massive data.
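Of the three parallelism dimensions mentioned above, the data-parallel one is the easiest to illustrate: each worker computes a gradient on its shard of the batch, and the shard gradients are averaged (the all-reduce step), which for equal-sized shards exactly equals the full-batch gradient. The sketch below demonstrates that equivalence with a toy least-squares model; it is a conceptual illustration, not MindSpore's actual implementation, and all names and data are invented.

```python
import numpy as np

# Conceptual sketch of data parallelism: averaging per-shard gradients
# reproduces the full-batch gradient (the role of all-reduce).

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 3))
y = rng.normal(size=8)
w = rng.normal(size=3)

def grad(Xs, ys, w):
    """Mean-squared-error gradient on one shard of the batch."""
    return 2 * Xs.T @ (Xs @ w - ys) / len(ys)

# Full-batch gradient, as a single device would compute it.
g_full = grad(X, y, w)

# The same batch split evenly across 4 "workers"; average their gradients.
shards = np.array_split(np.arange(8), 4)
g_avg = np.mean([grad(X[i], y[i], w) for i in shards], axis=0)

print(np.allclose(g_full, g_avg))  # prints True: the gradients agree
```

Pipeline and model parallelism instead split the network itself, across layers and within layers respectively; real hybrid-parallel systems combine all three and the frameworks automate the partitioning.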

So far, Huawei Cloud has applied AI in more than 600 projects across more than 10 industries nationwide, helping cities, transportation, medical care, steel, textiles, energy, finance and other industries upgrade their intelligence. In the future, Huawei Cloud will continue to drive intelligent industrial upgrading through technological innovation.
