“Not just any cloud can handle that traffic. There are two kinds of cloud in China: one is Ali Cloud and the other is ‘other clouds’.” “Alibaba Cloud is different. Ten years ago, starting from the first line of code, we built China’s only self-developed cloud operating system, Feitian,” said Zhang Jianfeng, Alibaba Group’s CTO, on the evening of November 11.



Zhang Jianfeng, CTO of Alibaba Group and president of Ali Cloud Intelligence

The 2019 Tmall Double 11 broke the world record again, with a new peak of 544,000 orders per second and 970PB of data processed in a single day. This year, Alibaba’s core systems moved 100% onto the cloud, supporting the world-class traffic peak of Double 11.

“Alibaba is an airplane flying at high speed, and we managed to fit a new engine in mid-flight,” Zhang Jianfeng said, adding that Ali Cloud is the first company to use a public cloud to carry a core system at this trillion scale. “Many cloud vendors do not run their own business systems on their own cloud. In the future, all of Alibaba’s systems will run on Ali Cloud.”

Alibaba is the first major Internet company in the world to run 100% of its core systems on the public cloud.

Attachment: Key points of Zhang Jianfeng’s speech

Starting last year, we decided to move Alibaba’s entire core system onto the cloud. “On the cloud” needs a qualifier here: it means on Ali Cloud, because not just any cloud can handle this traffic. We often say that there are two kinds of cloud in China: one is called “Ali Cloud” and the other is called “other clouds”.

Why is Ali Cloud different from other clouds? Because Ali Cloud is the only cloud in China developed entirely from scratch, with every line of code written by us, starting ten years ago. That is Ali Cloud’s ten-year history. So our cloud has a special name, the “Feitian operating system”, a name that goes back to Wang Jian at Ali Cloud: the scheduling system underneath the cloud, the foundation of the cloud platform, is called Feitian OS.

Our cloud is a completely self-developed cloud in China. Many other clouds are transformed from open source software, which is very different.

Second, on this Double 11, Alibaba was still an airplane flying at high speed, and we successfully swapped in a new engine mid-flight. Previously we put only non-core workloads on the cloud; now Alibaba’s most critical core systems are on the cloud. Our cloud has turned what used to be dedicated, in-house technology into a public cloud, so that everyone can benefit from it as a general service. We have the best systems, and you can build on top of them.

Ali Cloud hosts 100% of Alibaba’s core systems, which is a world first. Many cloud vendors have their own business systems, but those systems do not run on their own clouds. All of Alibaba’s systems run on the Feitian operating system of Ali Cloud.

You may wonder whether this was a simple replacement or a major technological advance and a major challenge. In brief, we did several things very well. It was not just a replacement: moving to the cloud brought a very large performance improvement.

Today you may have noticed that in the first ten minutes consumers felt essentially no jitter, and everyone’s shopping went very smoothly.

First, for the core virtual machine system, we developed the Shenlong (“Divine Dragon”) architecture ourselves and use our own servers for virtualization. On a typical server, output falls off as load increases; it does not scale linearly. On Shenlong servers, output stays very close to linear even as we apply more and more pressure, which is very difficult to achieve.

Second, we developed our own cloud-native databases. This year there are two. One is OceanBase, which we developed ourselves and which, as you know, took first place in the world in the TPC-C benchmark. The other is PolarDB, which was also widely used in this Double 11. Both ran without problems, and our peak throughput per second was far beyond that of traditional databases such as Oracle.

Third, we have separated compute from storage. Alibaba’s storage now lives in a dedicated place for storing data, and data is accessed remotely, so storage can be expanded easily because it has its own dedicated pool. This was not possible before, because network speeds were too slow for remote storage access.

Fourth, the reason we can now use remote storage, and read and write remote disks faster than local ones, is that we built an RDMA network. We were the first company in the world to deploy RDMA networking at large scale.

We have made core breakthroughs in these four areas.

Today, the Feitian operating system can schedule across more than 100,000 servers. Double 11 involves many applications, but that does not mean every application carries the same load at every point in time. At midnight we allocate most of our computing resources to applications such as transactions, and a little while later we reallocate them to the data processing system. Data processing hit a new high this year: we processed 300PB of data on the day in 2017, 600PB in 2018, and expect to process around 970PB this year.
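The time-shifted allocation described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the phase names, the percentage shares, and the `allocate` helper are illustrative assumptions, not details of the Feitian scheduler. Only the idea of one shared pool of roughly 100,000 servers whose split changes over the day comes from the speech.

```python
# Hypothetical sketch: one shared server pool re-split between workloads by phase.
# The phases and percentages are made-up illustrations, not Alibaba's real figures.

TOTAL_SERVERS = 100_000  # pool size taken from the speech ("more than 100,000")

# Assumed share of the pool given to each workload in each phase (integer percents).
PHASE_SHARES = {
    "midnight_peak": {"transactions": 80, "data_processing": 20},
    "after_peak": {"transactions": 30, "data_processing": 70},
}


def allocate(phase, total=TOTAL_SERVERS):
    """Return the number of servers per workload for the given phase."""
    shares = PHASE_SHARES[phase]
    if sum(shares.values()) != 100:
        raise ValueError("shares for a phase must add up to 100 percent")
    return {workload: total * pct // 100 for workload, pct in shares.items()}


if __name__ == "__main__":
    for phase in PHASE_SHARES:
        print(phase, allocate(phase))
```

The point of the sketch is that the pool size stays fixed while the split moves, so the same hardware serves the transaction peak first and the data processing load afterwards.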

You may have no idea how much data a petabyte (PB) actually is. I recently talked with someone from CCTV: after filming TV news programs for so many years, CCTV has accumulated about 80PB of data over the decades. On November 11 we had to process 970PB of data, a huge amount that could not be sustained without an advanced system.
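To put these figures in proportion, here is a quick calculation that uses only the numbers quoted in the speech (300PB, 600PB, 970PB, and the roughly 80PB CCTV archive). The variable names are ours, and the byte conversion assumes decimal petabytes (1PB = 10^15 bytes), which the speech does not specify.

```python
# Figures quoted in the speech, in petabytes (PB).
processed_pb = {2017: 300, 2018: 600, 2019: 970}
cctv_archive_pb = 80  # approximate archive size mentioned for CCTV

# Year-over-year growth in data processed on Double 11.
growth_2018 = processed_pb[2018] / processed_pb[2017] - 1  # 1.0, i.e. doubled
growth_2019 = processed_pb[2019] / processed_pb[2018] - 1  # about 0.62

# How many CCTV-sized archives fit into one day's processing in 2019.
archives_per_day = processed_pb[2019] / cctv_archive_pb

# Assumption: decimal petabytes, 1 PB = 10**15 bytes.
bytes_2019 = processed_pb[2019] * 10**15

print(f"2018 growth: {growth_2018:.0%}")
print(f"2019 growth: {growth_2019:.0%}")
print(f"CCTV archives per day: {archives_per_day:.1f}")
```

In other words, the volume doubled from 2017 to 2018, grew by roughly another 60% in 2019, and one day of Double 11 processing is about twelve times the CCTV archive mentioned above.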

The second point is real-time processing. As you can see, today’s flash sales and event pages are all personalized, with a different view for every shopper, so the data is not only large in volume but also has to be processed in real time. So far, our Cainiao logistics system has generated more than 1 billion logistics orders, and the number is rising rapidly, all of which relies on the massive computing power of Ali Cloud behind it.

This year, in addition to batch processing, we also use stream processing, which means all the data is processed in real time. The transaction figures you see, for example, change from minute to minute and from second to second. They are not computed by querying a database; they are aggregated automatically, layer by layer, as each order is generated. We have a streaming system that can process 2.5 billion records per second.
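The difference from querying a database can be sketched as follows. This is a minimal, single-process illustration of incremental aggregation, not Alibaba’s streaming system; the `OrderEvent` type, the `RunningTotals` class, and the sample amounts are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderEvent:
    """Hypothetical order record emitted when an order is created."""
    category: str
    amount_cents: int


class RunningTotals:
    """Keeps totals up to date as events arrive, instead of re-scanning stored orders."""

    def __init__(self):
        self.total_cents = 0
        self.by_category = defaultdict(int)

    def on_event(self, event):
        # Each event updates the aggregates in constant time; no database query is needed.
        self.total_cents += event.amount_cents
        self.by_category[event.category] += event.amount_cents


if __name__ == "__main__":
    totals = RunningTotals()
    sample = [
        OrderEvent("phones", 299_900),
        OrderEvent("books", 4_500),
        OrderEvent("phones", 199_900),
    ]
    for event in sample:
        totals.on_event(event)
    print(totals.total_cents, dict(totals.by_category))
```

A production system would partition events across many machines and merge partial totals upward, which is one plausible reading of “layer by layer” in the speech, but that part is not shown here.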

With so many servers to manage, in addition to the Feitian system we also have to manage all the message flows, so we developed a messaging system called MQ, which is the largest messaging system in the world.

Today, from the Feitian system and the big data processing platform to intelligent applications, these technologies stack up into a new distributed, cloud-based platform on which all the core applications of the entire Alibaba economy run.

Finally, this year we released our own chip, the Hanguang 800. By next year’s Double 11, a large number of Alibaba’s artificial intelligence applications will run on our self-developed chips.

Today, everything from the Feitian cloud operating system to the DpCA server, the database, the switches, the switch operating system, and the RDMA network is developed by Alibaba itself. We have built up very rich and very strong capabilities, from hardware, databases, and the cloud operating system up to the core application platforms above them. This is the biggest difference between this Double 11 and previous years.

Everything once unimaginable will eventually become ordinary. We believe in “believing”, and everything is new. Happy Double 11!



This article is original content from the Yunqi Community and may not be reproduced without permission.