Audio and video technology development weekly | 201

A weekly roundup of practical, in-depth content from the audio and video technology field.

News submission: [email protected].

HDR: create a visual feast for users

As times change, people are increasingly dissatisfied with the limited colors a screen can display, and have begun to study how to make screens look closer to the real world. For LiveVideoStackCon 2021 Shanghai, we invited Zhang Jiajie from Kuaishou's Audio and Video Technology Department. He starts with a few short stories to analyze why photos do not perfectly reproduce the real world, and shares practical insights on HDR (high dynamic range) video.

OneVPL with FFmpeg/GStreamer hardware codec

People are less familiar with hardware codecs than with software codecs. For LiveVideoStackCon 2021 Shanghai, we invited Xu Guangxin, a media engineer at Intel, to share Intel's latest developments in hardware codecs.

IETF interview: As HTTP/3's global share continues to grow, QUIC's future looks bright

This article is a recent IETF interview with Lucas Pardue about QUIC standardization efforts, written by IETF Blog reporter Grant Gross.

Technical details of HTTP request merging and splitting

This article runs a simple experiment that uses data to analyze request merging and splitting in HTTP, and whether concurrent requests affect one another.

VVC fast affine motion compensation

VVC uses a multi-type tree (MTT) to partition blocks, which makes partitioning more flexible but also greatly increases complexity. Affine motion compensation (AME) built on this partitioning adds further complexity. This paper extracts features that reflect the statistical characteristics of MTT and AME, and uses them to skip redundant AME processing and save AME time.

Overview of AI image/video codec in USTC

This paper, from the USTC team, reviews representative work on image and video coding using deep learning.

zhuanlan.zhihu.com/p/379450898

WeChat mini game live streaming: Android cross-process rendering and stream pushing in practice

For performance and security reasons, WeChat mini games run in a separate process, and that process does not initialize the modules related to live video streaming. This means a mini game's audio and video data must be transmitted across processes to the main process for stream pushing, which brought a series of challenges to implementing mini game live streaming.

Cisco Webex and next generation video conferencing

The increasing use of video conferencing in people's daily lives, and especially the rapid growth of the video conferencing market during the COVID-19 pandemic, has driven continuous updates to Cisco Webex's video technology. We are joined by Thomas Davies, Principal Engineer in Cisco's Collaboration Technology Group, who shares the evolution of AV1, the challenges faced in developing it, and the future of AV2 and its role in real-time communications.

VideoLab: a high-performance and flexible iOS video editing and effects framework

VideoLab is an open source, high-performance and flexible iOS video editing and effects framework whose usage model is closer to that of AE (Adobe After Effects). The framework core is based on AVFoundation and Metal.

Principle and implementation of audio and video synchronization

This article describes the principle of audio and video synchronization and the common synchronization schemes, and uses code examples to show how to take the audio playback time as the reference and synchronize the video to it.
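As an illustration of the audio-as-reference scheme summarized above, here is a minimal Python sketch. It is not taken from the linked article: the function name, the 40 ms tolerance, and the three-way render/wait/drop decision are all assumptions chosen for illustration, and a real player would read the audio clock from its audio output device.

```python
from dataclasses import dataclass

# Hypothetical tolerance in seconds (roughly one frame at 25 fps).
SYNC_THRESHOLD = 0.040


@dataclass
class Decision:
    action: str   # "render", "wait", or "drop"
    delay: float  # seconds to wait before rendering; 0.0 unless action == "wait"


def sync_video_to_audio(video_pts: float, audio_clock: float,
                        threshold: float = SYNC_THRESHOLD) -> Decision:
    """Decide what to do with a video frame, treating the audio clock as master.

    video_pts   -- presentation timestamp of the next video frame, in seconds
    audio_clock -- current audio playback position, in seconds
    """
    diff = video_pts - audio_clock
    if diff > threshold:
        # Video is ahead of audio: hold the frame until the audio catches up.
        return Decision("wait", diff)
    if diff < -threshold:
        # Video is behind audio: drop the frame so the video can catch up.
        return Decision("drop", 0.0)
    # Within tolerance: show the frame now.
    return Decision("render", 0.0)


if __name__ == "__main__":
    audio_clock = 1.000
    for pts in (0.900, 0.990, 1.020, 1.200):
        print(pts, sync_video_to_audio(pts, audio_clock).action)
```

Audio is usually chosen as the master because listeners notice audio glitches more readily than an occasional delayed or dropped video frame, so the sketch only ever adjusts the video side.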

AliCloudDenoise speech enhancement algorithm: bringing real-time conference systems into the era of ultra-clear sound quality

In recent years, with the development of real-time communication technology, online meetings have gradually become an indispensable office tool. According to incomplete statistics, about 75% of online meetings are audio-only, with no need to turn on the camera or share the screen. In such meetings, voice quality and clarity are critical to the experience.

New Facebook achievement: HuBERT for self-supervised representation learning for speech recognition, generation and compression

To open the door to modeling these types of rich lexical and non-lexical information in audio, Facebook has introduced HuBERT, a new approach to learning self-supervised speech representations. HuBERT matches or even exceeds the state of the art (SOTA) in speech recognition, speech generation, and speech compression.

Video quality evaluation: Challenges and opportunities

This article is compiled from a talk by Wang Haiqiang, assistant researcher at Pengcheng Laboratory, given at a LiveVideoStack online sharing session. Drawing on his own practical experience, he explains in detail the challenges and opportunities of video quality evaluation.

Evaluating videos with the Advanced Video Quality Tool (AVQT)

This article is based on Pranav Sodhani's session "Evaluate Videos with the Advanced Video Quality Tool" at WWDC 2021. Pranav Sodhani is a member of Apple's display and color technology team with expertise in algorithm development, machine learning, color science, and video technology.

The world’s first open source image recognition system has gone live!

Image recognition is surely familiar to everyone by now. The technology has reached deep into every aspect of our lives: from small things such as face unlock, payment, clocking in, hotel check-in, camera-based identification of driving violations, and searching for celebrities by image online, to large things such as driving assistance in autonomous vehicles, assisted diagnosis from medical imaging, and image and video analysis, editing, re-creation and so on…

A new way to play with anime style! Generate cartoon portraits of girls in different styles, with variable skin color and hairstyle

A single input face image can generate anime images in a variety of styles. Researchers at the University of Illinois at Urbana-Champaign have done just that. Their new GAN-based transfer method achieves a one-to-many generation effect.

How has object detection developed? | CVHub walks you through 22 years of object detection

The field of object detection has been developing for more than 20 years. From the early traditional methods to today's deep learning methods, accuracy keeps rising while speed keeps improving, thanks to the continuous development of deep learning and related technologies. This article gives a systematic introduction to the development of the field, aiming to build a complete knowledge framework for readers and to help them understand the object detection technology stack and its future trends.

Half-Life: Alyx: What's so hard about developing VR hand interactions?

Gaming site Kotaku recently sat down with Half-Life: Alyx hand interaction developer Kerry Davis to find out what other directions he explored while developing the game, and which details improve the experience yet are hard for players to notice.

The success of self-driving cars depends on teleoperation

Teleoperation, or remote operation, is a technical means of achieving remote interaction between people and controlled objects. The control end is local, and the execution end is somewhere in remote space that cannot be sensed directly from the local side. The technology is now mostly used in robotics, and it is also promising for self-driving cars. For now, and at least for the next 10 to 20 years, autonomous driving is not going to be completely unmanned and will still require human intervention. Today, the management of nuclear power plants and the flying of airplanes still involve human intervention, rather than being 100 percent controlled by artificial intelligence.

CVPR 2021 | Latest progress in Tesla's pure-vision autonomous driving

At the CVPR 2021 autonomous driving workshop, Andrej Karpathy, Tesla's Director of AI, spoke about the latest developments in Tesla's pure-vision approach, including Autopilot and FSD.

Recommended event

20% off for tickets purchased at **** until July 4. Click [Read the original] or scan the QR code in the picture for details.

Illustrations are from Pexels.
