What is Web multimedia technology

With the growing popularity of short video and live streaming in recent years, interest in learning audio and video technology has surged, and audio/video technology in Web scenarios has also seen new applications and breakthroughs.

Web front end

Interactive experience, front-end engineering, cross-end capabilities…

Digital multimedia

Audio and video principles, packaging containers, encoding and decoding algorithms…

  • On-demand (VOD): return a video of fixed length at the user's request, covering short video (Douyin), medium video (Bilibili), and long video (TV series, movies)
  • Live streaming: push one user's picture and sound to other users' clients for watching and browsing
  • Images: image distribution, download-link monitoring and diagnosis, image-format compatibility adjustment, dynamic image editing
  • Real-time communication: high-quality, low-latency audio/video communication for video conferencing, online education, and interactive entertainment
  • Cloud gaming: the game runs on the server side, lowering the requirements on the client, but demanding very low audio/video latency
  • Video editing: decoding and compositing of video

Basic knowledge of audio and video

What we usually call resolution is measured in pixels; each pixel contains three sub-pixels, corresponding to the R, G, and B color channels.

Video is a series of images played back continuously over time; the playback speed is what we call the frame rate. For example, 24 fps means 24 images per second.

Assume each sub-pixel is represented by 8 bits, with a resolution of 1280×720, a frame rate of 25 fps, and a duration of 60 s:

Uncompressed video size = 8 bit × 3 × 1280 × 720 × 25 × 60 ≈ 3.9 GB

H.264 compressed video size ≈ 11 MB

That is a compression ratio of roughly 360:1.
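A quick sanity check of those numbers (a minimal sketch; it just reproduces the arithmetic above):

// Raw video size: bits per sub-pixel * sub-pixels per pixel * pixels * fps * seconds
const bitsPerSubPixel = 8;
const subPixelsPerPixel = 3;              // R, G, B
const pixels = 1280 * 720;
const fps = 25;
const seconds = 60;

const rawBits = bitsPerSubPixel * subPixelsPerPixel * pixels * fps * seconds;
const rawGiB = rawBits / 8 / 1024 / 1024 / 1024;            // ≈ 3.86, i.e. the "3.9G" above

const compressedMiB = 11;                                    // size after H.264 encoding
const ratio = (rawBits / 8 / 1024 / 1024) / compressedMiB;   // ≈ 360 : 1
console.log(rawGiB.toFixed(2), ratio.toFixed(0));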

Therefore, video needs to be compressed after production so that traffic can be saved during transmission and download.

Video coding:

Uncompressed video → encoder → compressed storage and transmission

Video decoding:

Compressed video → decoder → restore the original pixels

Suppose the image on the left is one frame of a video. In the blue sky, adjacent small blocks are extremely similar to one another; for example, of the four blocks in the upper-left corner, one block can approximately stand in for the other three. In other words, we only need to store one block of information to reconstruct all four, which saves a lot of space. If an algorithm removes this kind of redundancy from every block, the amount of data that actually needs to be stored becomes very small.

That covers redundancy within a single frame (spatial redundancy). Now look at redundancy across time: suppose the two images on the right are consecutive frames. They share a large amount of redundant information, such as the grass, the football, and the pitch. So the algorithm can store the first frame in full and, for the second frame, record only the information that differs from the first, again saving storage.
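As a toy illustration of the inter-frame idea (this is not a real codec; frames are treated as plain arrays of pixel values):

// Toy inter-frame "compression": keep the first frame, then store only the
// pixels that changed relative to the previous frame.
function diffFrames(prevFrame, curFrame) {
    const changes = [];
    for (let i = 0; i < curFrame.length; i++) {
        if (curFrame[i] !== prevFrame[i]) {
            changes.push([i, curFrame[i]]);   // position + new value
        }
    }
    return changes;                           // usually far smaller than a full frame
}

function applyDiff(prevFrame, changes) {
    const frame = prevFrame.slice();
    for (const [i, value] of changes) frame[i] = value;
    return frame;                             // reconstructs the next frame
}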

Coding format development:

Coding technology keeps evolving with demand: higher compression ratios come with higher algorithmic complexity and higher hardware requirements.

H.264 is the format with the best browser compatibility and is the most widely used.

Google has also developed its own coding formats (VP8/VP9).

Container packaging format:

After the video passes through the encoder we get a "raw stream". If this is played directly, it can only be played from beginning to end, with no operations in between. So we package the raw stream into a file with a fixed format, the "container" of the audio/video raw streams, which also carries meta information such as duration, frame rate, and the size and position of each data frame.

Native multimedia capabilities provided by the browser

To play audio and video in the browser, the easiest way is to use the video and audio elements.

<!DOCTYPE html>
<html>
    <body>
        <!-- autoplay + controls: start playing automatically and show the native controls -->
        <video autoplay controls width="600" height="300">
            <source src="//sf1-cdn-tos.huoshanstatic.com/obj/media-fe/xgplayer_doc_video/mp4/xgplayer-demo-720p.mp4">
        </video>
        <audio controls src="//sf1-cdn-tos.huoshanstatic.com/obj/media-fe/xgplayer_doc_video/music/audio.mp3">
        </audio>
    </body>
</html>

Effect: the browser handles decoding and rendering; the encoding format is H.264.

Supported formats here: MP4 (video), MP3 (audio)

<!DOCTYPE html>
<html>
    <body>
        <button onclick="playVid()">Play the video</button>
        <button onclick="pauseVid()">Pause the video</button>
        <button onclick="loadVid()">Reload the video</button>
        <video id="vs" src="demo.mp4"></video>
        <script>
            const myVideo = document.getElementById("vs")
            
            // Start playback
            function playVid(){
                myVideo.play()
            }
            
            // Pause playback
            function pauseVid(){
                myVideo.pause()
            }
            
            // Reset the element and reload the media resource
            function loadVid(){
                myVideo.load()
            }
        </script>
    </body>
</html>

Volume and the current playback time can be read and set in the same way:

<!DOCTYPE html>
<html>
    <body>
        <button onclick="getVolume()">Get volume</button>
        <button onclick="setVolume()">Set volume</button>
        <button onclick="getCurTime()">Get current time</button>
        <button onclick="setCurTime()">Set current time</button>
        <video id="vs" src="demo.mp4"></video>
        <script>
            const myVideo = document.getElementById("vs")
            
            // Read the current volume (0.0 – 1.0)
            function getVolume(){
                alert(myVideo.volume)
            }
            
            // Set the volume
            function setVolume(){
                myVideo.volume = 0.2
            }
            
            // Read the current playback position in seconds
            function getCurTime(){
                alert(myVideo.currentTime)
            }
            
            // Seek to the 5-second mark
            function setCurTime(){
                myVideo.currentTime = 5
            }
        </script>
    </body>
</html>

Disadvantages of the video and audio elements

  • Video formats such as HLS and FLV are not supported

  • Requests for and loading of video resources cannot be controlled from code, which rules out:

    • Segmented loading (saves traffic)
    • Seamless switching between resolutions
    • Precise preloading

How major video sites use the video element

As you can see from the demo above, the request type for the video element's resource is media, indicating the request is issued internally by the browser's media element.

On YouTube, by contrast, the video element's src is not assigned the direct address of the video file but a resource link beginning with blob:. The actual resource requests are of type XHR, i.e. HTTP requests issued from JS on the Web side, which can therefore be actively controlled by JS code.

MediaSource

MediaSource Extensions (MSE) extend the browser's video playback capabilities and support loading video in fragments (fMP4 segments are appended directly), replacing the Flash player.

It supports playback of MP4 (as a stream), HLS, FLV, and so on.

It enables segmented video loading, seamless resolution switching, precise preloading, and so on.

YouTube uses this technique above

// MIME type + codecs of the fMP4 we are going to append
let mimeCodec = 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"'
let mediaSource = new MediaSource()
let video = document.querySelector('video')

// The blob: URL connects the video element to the MediaSource
video.src = URL.createObjectURL(mediaSource)
mediaSource.addEventListener('sourceopen', function () {
    let sourceBuffer = mediaSource.addSourceBuffer(mimeCodec)
    fetchAB('frag_bunny.mp4', function (buf) {
        sourceBuffer.addEventListener('updateend', function () {
            video.play()
        })
        // Append the fMP4 bytes; 'updateend' fires once they have been buffered
        sourceBuffer.appendBuffer(buf)
    })
})

// Download the fragment as an ArrayBuffer
function fetchAB(url, cb) {
    let xhr = new XMLHttpRequest()
    xhr.open('get', url)
    xhr.responseType = 'arraybuffer'
    xhr.onload = function () { cb(xhr.response) }
    xhr.send()
}

The above shows the structure of an MP4 file: it is composed of one unit ("box") after another, and the top layer consists of three types of units.

  • ftyp: file type information
  • moov: the video's meta information, including tracks and other details
  • mdat: the media (source) data

This is the structure of an fMP4 file. The first two units are the same as in plain MP4, but the media data that follows is split into fragments, so the audio and video are stored segment by segment.
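All of these top-level units ("boxes") share the same layout: a 4-byte big-endian size followed by a 4-byte ASCII type. A minimal sketch that lists them from an ArrayBuffer (it ignores 64-bit large-size boxes for simplicity):

// Walk the top-level boxes of an MP4/fMP4 file: ftyp, moov, mdat (and moof for fMP4)
function listTopLevelBoxes(arrayBuffer) {
    const view = new DataView(arrayBuffer);
    const boxes = [];
    let offset = 0;
    while (offset + 8 <= view.byteLength) {
        const size = view.getUint32(offset);                      // box size in bytes
        const type = String.fromCharCode(
            view.getUint8(offset + 4), view.getUint8(offset + 5),
            view.getUint8(offset + 6), view.getUint8(offset + 7)  // e.g. "ftyp"
        );
        boxes.push({ type, size, offset });
        if (size < 8) break;                                      // malformed or unsupported size
        offset += size;
    }
    return boxes;
}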

Using MSE to implement MP4 streaming playback

First request the resource, demux it to obtain the raw H.264 stream, repackage that as fMP4, and append it to MSE; then it can be played.
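A rough sketch of that pipeline; demuxToH264 and muxToFmp4 are hypothetical placeholders for a real demuxer/remuxer, while the MediaSource calls are the same standard API shown above:

async function streamMp4(url, video) {
    const mediaSource = new MediaSource();
    video.src = URL.createObjectURL(mediaSource);
    mediaSource.addEventListener('sourceopen', async () => {
        const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');
        const response = await fetch(url);                // 1. request the resource
        const mp4Bytes = await response.arrayBuffer();
        const h264Stream = demuxToH264(mp4Bytes);         // 2. demux to the raw H.264 stream (hypothetical helper)
        const fmp4Segments = muxToFmp4(h264Stream);       // 3. repackage as fMP4 fragments (hypothetical helper)
        for (const segment of fmp4Segments) {
            sourceBuffer.appendBuffer(segment);           // 4. feed each fragment into MSE
            await new Promise(r => sourceBuffer.addEventListener('updateend', r, { once: true }));
        }
        video.play();
    });
}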

The development and breakthrough of Web multimedia technology

Encrypted audio and video playback

When we watch TV series or movies, video websites apply anti-leeching protection to the streams; otherwise, anyone could grab the resource directly with the browser's developer tools.

When a request is made, an encrypted resource is delivered; it is downloaded to the client and only then decrypted and played.
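The article doesn't name a specific mechanism, but the standard browser API for this kind of flow is Encrypted Media Extensions (EME). A minimal sketch using the Clear Key key system, with the license-server exchange only hinted at in comments:

async function setupEncryptedPlayback(video) {
    // Ask the browser for a key system that can handle our encrypted MP4
    const access = await navigator.requestMediaKeySystemAccess('org.w3.clearkey', [{
        initDataTypes: ['cenc'],
        videoCapabilities: [{ contentType: 'video/mp4; codecs="avc1.42E01E"' }]
    }]);
    const mediaKeys = await access.createMediaKeys();
    await video.setMediaKeys(mediaKeys);

    // Fired when the browser encounters encrypted media data
    video.addEventListener('encrypted', async (event) => {
        const session = mediaKeys.createSession();
        session.addEventListener('message', async (msg) => {
            // msg.message is the license request; send it to your license server,
            // then pass the returned license/keys back:
            // await session.update(licenseResponse)
        });
        await session.generateRequest(event.initDataType, event.initData);
    });
}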

Adaptive bitrate (ABR)

When watching videos on Bilibili there is an "Auto" option in the definition selector: when the network environment deteriorates or the player's buffer is running low, the player automatically switches to a lower bitrate to reduce stuttering and keep playback as smooth as possible.
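A toy version of that decision logic (real players use far more sophisticated bandwidth and buffer estimators; the bitrate ladder and thresholds here are made up for illustration):

// Pick the highest rendition the measured bandwidth can sustain,
// and drop to the lowest one when the playback buffer is nearly empty.
const renditions = [                      // hypothetical ladder, kbps
    { name: '360p',  bitrate: 800 },
    { name: '720p',  bitrate: 2400 },
    { name: '1080p', bitrate: 5000 },
];

function pickRendition(bandwidthKbps, bufferSeconds) {
    if (bufferSeconds < 2) return renditions[0];          // about to stall: go lowest
    const affordable = renditions.filter(r => r.bitrate * 1.2 < bandwidthKbps);  // 20% safety margin
    return affordable.length ? affordable[affordable.length - 1] : renditions[0];
}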

Danmaku (bullet comments)

Danmaku have given video websites new vitality; they are now practically a must-have feature for every video site.

  • Collision avoidance: comments do not overlap one another (see the sketch after this list)
  • Interactive danmaku
  • Reduced as
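A minimal sketch of the collision-avoidance idea: each new comment is assigned to a horizontal lane that has already been freed by the previous comment in it (a simplification; real implementations also account for faster comments catching up with slower ones, and all names and numbers here are illustrative):

// Each lane remembers the time at which its current comment will have
// fully scrolled off screen; a new comment may enter once that time has passed.
function createLaneAllocator(laneCount) {
    const freeAt = new Array(laneCount).fill(0);     // per-lane "busy until" timestamp (ms)
    return function allocate(now, crossTimeMs) {
        for (let lane = 0; lane < laneCount; lane++) {
            if (freeAt[lane] <= now) {
                freeAt[lane] = now + crossTimeMs;    // reserve the lane for this comment
                return lane;                         // render the comment in this lane
            }
        }
        return -1;                                   // no free lane: queue or drop the comment
    };
}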

Soft decoding: playing H.265 on the Web and resisting hijacking by domestic browsers

Web browsers generally support only H.264, but H.265 offers a much higher compression ratio, so demand for playing H.265 is also very high. The usual approach relies on WebAssembly, which lets code written in C be compiled into a form that JS can call: an H.265 decoder written in C is compiled this way, the resource is requested in the browser and decoded by the WebAssembly decoder, and once the raw pixel data of the video has been decoded it is rendered with WebGL.
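A very rough sketch of that pipeline. createH265Decoder stands in for a hypothetical module compiled from a C decoder with WebAssembly (its API is invented for illustration), and for brevity the decoded RGBA frame is drawn with the 2D canvas API rather than WebGL:

async function playH265(url, canvas) {
    const ctx = canvas.getContext('2d');
    const decoder = await createH265Decoder();         // hypothetical wasm-backed decoder
    const response = await fetch(url);                  // request the encoded resource over HTTP
    const encoded = new Uint8Array(await response.arrayBuffer());

    // Hypothetical API: push encoded bytes in, receive raw RGBA frames back
    decoder.onFrame((rgba, width, height) => {
        const frame = new ImageData(new Uint8ClampedArray(rgba), width, height);
        ctx.putImageData(frame, 0, 0);                  // paint the decoded pixels
    });
    decoder.decode(encoded);
}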

Audio is usually decoded and played through the AudioContext API, which is also used to handle audio/video synchronization.
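For the audio side, a minimal sketch using the Web Audio API (the URL is a placeholder and the synchronization logic is omitted):

async function playDecodedAudio(url) {
    const audioCtx = new AudioContext();
    const response = await fetch(url);
    const encoded = await response.arrayBuffer();
    const audioBuffer = await audioCtx.decodeAudioData(encoded);  // decode to raw PCM

    const source = audioCtx.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(audioCtx.destination);
    source.start();                                               // begin playback
    return source;
}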

Another reason to use soft decoding is to resist hijacking by domestic browsers: some domestic browsers replace the page's player with their own and add extra features, so the APIs of our original player get overridden.

The hijacking works by detecting the page's video tag. Soft decoding renders with WebGL instead of using a video tag, so the hijack does not apply.

Pushing a stream from a web page

As you know, going live normally requires downloading software such as OBS, which captures the audio and video streams you want to send, packages them, and pushes them out. Now pushing a stream can be done just by opening a web page, which is very convenient.
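A minimal sketch of capturing and pushing from the page itself (the ingest URL and chunk interval are placeholders; real products may use WebRTC or other transports instead):

async function startWebPush() {
    // 1. Capture the user's camera and microphone
    const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });

    // 2. Encode/package it in the browser
    const recorder = new MediaRecorder(stream, { mimeType: 'video/webm' });

    // 3. Send the encoded chunks to a (hypothetical) ingest server
    const socket = new WebSocket('wss://example.com/ingest');
    recorder.ondataavailable = (event) => {
        if (event.data.size > 0 && socket.readyState === WebSocket.OPEN) {
            socket.send(event.data);
        }
    };
    socket.addEventListener('open', () => recorder.start(1000));  // emit a chunk every second
}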

Image decoding

Because image formats differ, some formats cannot be rendered natively by the browser, so engineers have to decode and render those images themselves.

Cloud gaming principles and features

A cloud game runs on a remote server; the server then streams the game's audio and video frames to the client based on how the game is running.

Cloud gaming has two main features

  • No installation required and low hardware requirements on the client
  • Very low latency and strong compatibility are required