Does video have boundaries?

In the past, the answer was yes.

Video used to be confined to the TV set and the big screen. But as more hardware devices enter the home, network technology advances generation by generation, computing power keeps growing, and video codecs keep improving, video has become a new carrier of information. As the foundation of the big video industry in this new era, video cloud has been given a key mission: to change society.

On July 10, “Imagine”, the 2021 Alibaba Cloud Video Panorama Innovation Summit and Global Video Cloud Innovation Challenge Final Award Ceremony, was held in Beijing. The summit covered the future prospects of video, the panoramic blueprint for video cloud, cross-community dialogue among academia, art, and venture capital, and developers’ multi-dimensional exploration of audio and video technology. From the exchange of views at the summit, a promising path for the future of video is beginning to emerge.

From video to hypervideo: the expanding role of video cloud

In the last few years, the term video-ification has been used more and more. So what is video-ification?

To put it simply, video is increasingly becoming the carrier for transmitting information. As the barrier to producing video keeps falling, user acceptance keeps rising, and users spend more and more time on video, the era of all-video content has arrived. At the same time, video is generating new demand not only in the consumer sector but also in education, conferencing, health care, finance, and other industries.

Users are clearly spending more time on video overall, and video-based interaction across society, in a wide range of business scenarios, is also growing significantly. “Content is evolving more towards video, and the forms of interaction are more diverse. This is a hypervideo era compared to the previous video era.” This is how Lin Hao, Alibaba researcher and head of Alibaba Cloud Intelligent Video Cloud, defines the current changes.

To define an era, you first need to understand it. Lin explained that the hypervideo era has five characteristics: hypercontent, hyperinteraction, hyperlink, hyperlanguage ability, and a hypervision of the future. In other words, the form of video keeps evolving, interaction becomes richer, transmission transcends language barriers, and video can reach into the public’s daily life through AR, VR, and other means.

So how did the hypervideo era come about? Lin Hao believes that 5G, with its large bandwidth, has played an important role in driving AI and IoT to develop into intelligent networking. 5G enables ultra-high-definition video and VR/AR, with a network peak rate of 20 Gbit/s, a wireless interface delay of 1 ms, and significantly higher resolution. More importantly, 5G has opened up new forms of digital content. Whether in digital games, interactive entertainment, film and television animation, stereoscopic imaging, or digital performance, the expressive power and forms of video have been greatly enriched.
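To put the 20 Gbit/s figure in perspective, here is a rough back-of-the-envelope sketch. The per-stream bitrates are illustrative assumptions and do not come from the summit, and a peak rate is a theoretical ceiling that real users in a cell would share, so the results are upper bounds at best.

```python
def max_streams(link_gbit_s: float, stream_mbit_s: float) -> int:
    """Return how many streams of a given bitrate fit into a link, ignoring overhead."""
    link_mbit_s = link_gbit_s * 1000  # 1 Gbit/s = 1000 Mbit/s
    return int(link_mbit_s // stream_mbit_s)


PEAK_RATE_GBIT_S = 20.0      # peak rate quoted in the text
ASSUMED_4K_MBIT_S = 25.0     # assumption: bitrate of one compressed 4K stream
ASSUMED_8K_MBIT_S = 100.0    # assumption: bitrate of one compressed 8K stream

streams_4k = max_streams(PEAK_RATE_GBIT_S, ASSUMED_4K_MBIT_S)
streams_8k = max_streams(PEAK_RATE_GBIT_S, ASSUMED_8K_MBIT_S)
print(f"4K streams at peak rate: {streams_4k}")
print(f"8K streams at peak rate: {streams_8k}")
```

Changing the assumed bitrates shows how quickly higher-resolution formats consume the available headroom.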

Beyond that, cloud plus video acts as a catalyst for innovation in new scenarios, making it possible to combine the virtual and the real. With cloud, edge, and device integrated, computing power at the edge moves up and computing power in the cloud moves down, which reduces processing load and latency. Integrating audio and video technology in the cloud makes a consistent experience possible across endpoints. As AI technology matures, it empowers the entire video pipeline, and intelligence upends the old mode of content production. Meanwhile, mixed reality technology opens up new forms of content and interaction, breaking down the last barrier between the physical and digital worlds, linking them, and giving video as a carrier more possibilities.

Steve Jobs once said, “With a lower bandwidth, people are transmitting information, and a higher bandwidth will be used to transmit emotion.” The hypervideo era was born not just because bandwidth increased, but because technology evolved.

Technology evolves in two directions: content and interaction. Content evolves along four features: greater density, more dimensions, more senses, and topological space. Concretely, text and images, audio and video, live streaming and short video, and information and knowledge all become video-based, then content in every scenario becomes video-based, and finally content becomes immersive. Interaction evolves along the features of multi-device linkage, multi-person sharing, breaking through space, and seamless integration of the virtual and the real. Its path runs from offline, to online, to interactive, to fully online across all scenarios, and finally to immersive interaction.

It’s not hard to see that immersive interaction and immersive content forms are the real future we can explore. “Information will flow naturally from one interaction object to another. And the digital and the physical will coexist and enhance each other.”

An interactive experience like the one in Ready Player One is not out of the question. Of course, behind all that imagination lies deep technical work. What stands behind video is not an upgrade to a single technology such as AI, data, or codecs, but an entire technology system built on video cloud. Video cloud is not only cloud technology; it is also the continuous evolution of video technology as a whole. Whether the format is three-dimensional or holographic, the technology needs to keep evolving and being planned for, so that video fits its scenarios ever more closely and digital, intelligent audio and video can realize “innovation on cloud and value creation”.

The foundation of the big video industry: how video cloud is evolving across industries

As hypervideo develops, so does the Internet. The value of an industry is no longer measured by devices, but by time. While growth dividends in most areas of the Internet have nearly dried up, video-related sectors showed huge dividends last year, and Xu Fanlei, deputy general manager of iResearch, said those dividends will continue.

From the perspective of industry development, the video industry at its present stage is characterized by fragmentation, decentralization, high definition, and real-time delivery. On the demand side, users want video that is shorter, more frequent, and faster, expect the best possible quality, and need real-time audio and video. Real-time interaction is reshaping the application value of video, which now extends to financial services, health care, public utilities, social networking, education, consulting, and many other industries.

Taking a broader view of how information has been transmitted throughout human history, video has played a very important role. At first, humans communicated mainly through body language, which is physically demanding and ambiguous. Later came spoken language, which removed the physical burden but was bound by space and time and hard to pass on. Then came writing, which has been passed down for thousands of years, but its inherent barrier to entry and its limited richness of information led to the emergence of video. Video itself has kept evolving, from early television, to offline video players, to today’s live audio and video and interactive video.

However, video is still not perfect. There are two main problems. The first is that video is linear: you can fast-forward to a given point, but you cannot get an overview of the whole. The second is that video is slower and harder to revise than text. Because of these problems, industries will increasingly need to build video into what they do. In other words, video is no longer an industry in itself, but an underlying basic capability. Video applications built on video cloud will become a necessity, so it can be said that “video cloud is the base of the big video industry in the new era”.

The deep combination of industry and video is affecting not only products but also the landscape of many industries. Because industries are complex, their demands on video capabilities differ, but they have something in common. The first requirement is that video be easy to integrate and easy to measure; it should offer lower cost and more flexible scaling, so that trial and error can happen on the cloud and results can be put into production quickly.

Therefore, video cloud needs to provide different solutions and process support at each stage: production, processing, transmission, and consumption. Beyond deepening and specializing the handling of video itself, cloud services can also significantly lower the barrier to making high-quality, valuable video.

Throughout this process, cloud services are extremely important in supporting video. In production, video cloud can provide intelligent content processing, greatly improving creative efficiency and enabling efficient management of media assets. In processing, video cloud strikes the best balance between cost and picture quality through video processing and intelligent encoding. In transmission, video cloud accelerates delivery intelligently on top of a CDN and coordinates cloud and edge to reduce transmission delay and save bandwidth cost. Finally, in consumption, video cloud can provide beauty filters, voice beautification, immersive interaction, and other features to enrich the user experience.

Video cloud itself keeps evolving as it integrates with industries. At present it is concentrated mainly in the Internet and pan-entertainment fields, but because it can provide support at each of these stages, it can keep evolving and developing in other industries. Video cloud solutions also give users more choices. Whether the need is for application-level capabilities or for an industry’s general-purpose platform, different dimensions and different users can find different answers.

In addition, video cloud is still pursuing the technical state of the art. Although it is not yet mature enough to fully solve the problems of high definition, real-time delivery, and interactivity, the idea of software-defined everything is being combined with hardware at many points, such as routing, storage, and computing. Low-code development is also appearing widely in video cloud and the video industry, letting practitioners call functions more quickly and nimbly, improving usability, and making capabilities easy to call and easy to integrate.

In the future, video cloud is likely to generate more innovation, giving users more connections, lower barriers to entry, and more broadly accessible capabilities. For the video industry and the big video industry as a whole, video cloud technology is set to become a foundational capability.

Sustaining video cloud development: technical difficulties and breakthroughs

Video cloud serves as the foundation of the industry, and one of its major characteristics is that it is inclusive and all-embracing. In particular, as users’ demands for interactivity, presentation, and immersive experience grow, deep integration with AI will become the key to innovation in video cloud and the video industry. While video cloud expands into social networking, entertainment, education, and other fields, deep learning continues to play a huge role in images, speech, language, big-data feature extraction, and more. Future breakthroughs in video cloud technology will, to some extent, be driven by AI based on deep learning.

At the round-table forum that closed the event, Wang Shuhui, a researcher at the Intelligent Information Processing Laboratory of the Institute of Computing Technology, Chinese Academy of Sciences, said that the deep learning era has brought the third rise of artificial intelligence. This rise is driven mainly by deep learning performing well on many tasks, but deep learning still has problems at its core. Therefore, to achieve a breakthrough in video technology, three major technical problems should be solved from the perspective of the internal mechanism of deep learning.

  • First, existing deep learning relies too heavily on data, and it does not process data or make use of knowledge well enough. Building knowledge from multimodal and cross-media data on the web is therefore an important direction for the future.
  • Second, a knowledge base should be built to support machine reasoning, so that machines can draw inferences from data of different sources.
  • Third, people and computers have not historically been equal partners, for example in human-computer collaboration for content creation. In the core workflow, algorithms, systems, and people need to trust one another, so mutual trust, collaboration, and reliable reasoning will be the main problems to solve.

Of course, AI has many problems, but it also plays an important role in video. Xie Xuansong, a senior algorithm expert at DAMO Academy, said AI’s role in video falls into two main categories. The first is basic video or image understanding, including classification, tagging, detection, and segmentation. The second relates to production, such as generation, editing, processing, and erasure, and also includes low-level vision tasks such as enhancement.

Video image enhancement is a major application of AI. Low resolution makes for a very poor viewing experience, while more vivid color improves it, and more immersive experiences are the direction of travel. To produce 4K content, for example, detail, smoothness, and color are the important things to focus on. From a technical point of view, however, three problems must be faced head-on. First, the more detail is pursued, the more likely artifacts are to appear; restoring detail while keeping artifacts under control is a core technical challenge. Second, algorithms depend on data. Two kinds of data are needed, such as low-resolution and high-resolution images, or low-quality and high-quality images, and acquiring them often requires costly manual work, which is another major difficulty. Third, in applying AI in practice, striking a good balance between effectiveness and efficiency is also a problem.
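As a rough illustration of the data problem, the sketch below builds a low-resolution and high-resolution pair by degrading a tiny high-resolution image and scores a naive upscaling baseline with PSNR. This is synthetic degradation, a common stand-in rather than the manual acquisition described at the summit, and all function names and the toy image are hypothetical.

```python
import math


def downsample(img, factor):
    """Average-pool a 2D grayscale image (a list of rows) by an integer factor."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h - h % factor, factor):
        row = []
        for x in range(0, w - w % factor, factor):
            block = [img[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def upsample_nearest(img, factor):
    """Enlarge an image by repeating each pixel factor x factor times."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    n = len(ref) * len(ref[0])
    mse = sum((a - b) ** 2
              for r, t in zip(ref, test) for a, b in zip(r, t)) / n
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Toy 4x4 "high-resolution" image: flat regions on top, fine detail below.
hr = [[0, 0, 255, 255],
      [0, 0, 255, 255],
      [0, 255, 0, 255],
      [255, 0, 255, 0]]

lr = downsample(hr, 2)               # synthetic low-resolution half of the pair
baseline = upsample_nearest(lr, 2)   # cheapest possible "enhancement"
score = psnr(hr, baseline)
print(f"PSNR of nearest-neighbour baseline: {score:.2f} dB")
```

The flat top half survives the round trip while the detailed bottom half does not, which is where a learned enhancer would have to recover detail without introducing artifacts. A heavier model may score higher but cost more per frame, which is the effectiveness and efficiency trade-off named in the third problem.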

At present, AI is also moving in two directions: one is serving consumers, and the other is going deep into all walks of life to reduce costs, improve efficiency, and create new opportunities.

In the final analysis, of course, it is still people who drive innovation and technological upgrading. AI has been popular for many years, and many schools have launched AI-related training and degree programs. Yet for the market and the industry, the talent shortage remains serious. So where is the talent? Wang Shuhui said that most of his graduate students have gone into industry, and that universities have supplied the industry with a great deal of talent. But because the industry is developing so fast, high-level talent is scarce, and because different laboratories have different focuses, they cannot expand blindly.

Meanwhile, laboratory research abstracts problems away from reality and solves them with mathematical methods. Enterprises, however, expect something different from students: they expect them to understand the business and apply what they know in practice. The chain from academic research to business application is long, which makes it hard for students to be plug-and-play. And it is not only schools that are clearly aware of this; industries and companies are too.

The Global Video Cloud Innovation Challenge, hosted this year by Alibaba Cloud and Intel with Youku as strategic technology partner, held its final award ceremony at the summit. Organized by the Tianchi platform and Alibaba Cloud Video Cloud, the contest focused on the application and innovation of video cloud technology in industry and attracted more than 4,000 teams from 23 countries. It was divided into two tracks, “algorithm” and “innovation”, to discover talent and to encourage contestants to bring more imagination to the future.

In addition, the Alibaba Cloud Tianchi platform released the Tianchi datasets at the summit as an open-source project. It covers more than 60 scarce industry datasets drawn from real business scenarios in e-commerce, finance, logistics, health care, energy, and other fields. The hope is that, by opening up real business scenarios and data, Tianchi can work with all parts of society to build a professional platform for scientific research data.

The development of video cloud has become the choice of the times. It has changed business and society and become the foundation of the big video industry. Video cloud technology can be full of imagination, break through time and space, and make human communication more seamless and comfortable.

The future is here. Are you ready for the new world of video?


All talks from this Video Cloud Panorama Innovation Summit will be published on the “Video Cloud Technology” official account.
