To optimize a player's performance, we first need to understand the complete playback process and then look for optimizations at each stage. Here is the full flow:

  • Given a network URL, the player first issues a network request. Optimizing this step touches every aspect of network optimization
  • Once the network returns data, the player identifies the container format of the video. It may be a streaming format or an ordinary video file, and the optimization methods differ slightly
  • With the container format identified, the player parses (demuxes) it according to that format, extracting the audio stream, video stream, subtitle stream and so on
  • The audio stream is decoded into raw audio data and the video stream is decoded into raw video data
  • Audio and video must be kept in sync during decoding
  • The audio is played while the video is rendered

1. Playback pain points

From our day-to-day development experience, we have summed up several common problems in the playback process:

  • High playback failure rate
  • Slow first frame
  • Stuttering during playback
  • Playback consumes too many resources

Faced with these problems, we urgently need to answer two questions:

  • How do we monitor these problems
  • How do we solve these problems

The two questions build on each other: we monitor these problems so that we can solve them better.

2. Monitoring methods

We now know the pain points. When these problems occur, we need to collect data in order to analyze them; otherwise developers are left guessing.

2.1 Monitoring network loading

A network request is a complex process with many points along the whole link. If the player collects data at every one of these points, it can monitor the entire link. This gives us a comprehensive view of how network loading behaves: when a loading problem occurs, we also know at which point it occurred, and we have more complete data for analyzing and solving it.
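
As a minimal sketch of this kind of link monitoring, the timer below attributes elapsed time to named stages. The class name and the stage names ("dns", "connect", "first_byte") are our own assumptions for illustration; the actual collection points depend on the network stack in use.

```python
import time


class StageTimer:
    """Records how long each named stage of a request takes."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the timer can be tested
        self._last = clock()
        self.durations = {}          # stage name -> seconds

    def mark(self, stage):
        """Close the current stage and attribute the elapsed time to it."""
        now = self._clock()
        self.durations[stage] = now - self._last
        self._last = now

    def slowest(self):
        """Return the stage that took the longest, or None if nothing was marked."""
        if not self.durations:
            return None
        return max(self.durations, key=self.durations.get)
```

Calling mark() as each stage completes and reporting durations together with slowest() is enough to tell which point of the link a slow request spent its time in.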

2.2 Player link monitoring

The player's current operating state is also very important to developers. It can be broken down into a series of points in time, so that when a player state exception occurs, the developer knows exactly which state the player was in.

2.3 Player fluency monitoring

Stuttering means that buffering occurs during playback and the UI shows a loading spinner, which badly hurts the user experience: users complain and then quietly uninstall our app. The main cause of stuttering is a poor network; a smaller share is caused by the video source itself. The metrics to monitor are:

  • Number of stutters
  • Duration of each stutter
  • Network speed when the stutter occurs

The average number of stutters and the average stutter time per playback are important indicators of playback smoothness.
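
The two averages can be computed as in the sketch below. The names PlaybackSession, on_stall and summarize are hypothetical and only illustrate the bookkeeping; the units (milliseconds, kbps) are also our assumption.

```python
from dataclasses import dataclass, field


@dataclass
class PlaybackSession:
    """Stutter events recorded for one playback."""
    stalls: list = field(default_factory=list)  # (duration_ms, speed_kbps) pairs

    def on_stall(self, duration_ms, speed_kbps):
        """Record one stutter and the network speed observed at that moment."""
        self.stalls.append((duration_ms, speed_kbps))


def summarize(sessions):
    """Average stutter count and average stutter time per playback."""
    n = len(sessions)
    if n == 0:
        return {"avg_stalls": 0.0, "avg_stall_ms": 0.0}
    total_stalls = sum(len(s.stalls) for s in sessions)
    total_ms = sum(duration for s in sessions for duration, _ in s.stalls)
    return {"avg_stalls": total_stalls / n, "avg_stall_ms": total_ms / n}
```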

If the problem lies in the source, for example the progress bar advances during playback but the picture stays frozen, it is a video decoding problem: the decoder reports no error, but the decoded data is wrong. There is no truly effective way to optimize this. What you can do is monitor the size of the decoded data, detect abnormal output, and report it.

3. Optimize the success rate of playback

There are many reasons why playback can fail. When a video is played with a Player, the developer is notified of a failure through the player's onError callback, which returns at most one error code, corresponding to one playback error. Playback errors fall into the following categories:

  • Network loading error: the network request failed, possibly at any stage of the request
  • Video format recognition error: the format is not supported, or it was recognized incorrectly
  • Decoder error: the video or audio codec is not supported, or the system codec threw an exception
  • File I/O exception: the cache file cannot be read

How to handle a network loading error depends on the situation; network timeouts need a timeout-and-retry mechanism. For video format support, FFmpeg can recognize and process almost all video formats. MediaCodec decoding is limited by the phone's hardware and sometimes fails; in that case the player can switch to software decoding.
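
The retry and fallback ideas can be sketched as follows. This is a language-neutral illustration, not real player code: fetch, hardware_decode, software_decode and DecoderError are hypothetical stand-ins for whatever loader and decoders the player actually uses.

```python
class DecoderError(Exception):
    """Raised by a (hypothetical) decoder that cannot handle the stream."""


def load_with_retry(fetch, max_attempts=3):
    """Call fetch() until it succeeds or max_attempts timeouts have occurred."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return fetch()
        except TimeoutError as err:
            last_error = err  # only timeouts are retried; other errors propagate
    raise last_error


def decode_with_fallback(data, hardware_decode, software_decode):
    """Try the hardware decoder first and fall back to software on failure."""
    try:
        return hardware_decode(data), "hardware"
    except DecoderError:
        return software_decode(data), "software"
```

Returning which decoder was used makes it easy to report how often the fallback is taken, which feeds back into the monitoring described earlier.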

4. Optimize playback performance

4.1 Reusing connections:

When scrolling through videos in a feed, many of the videos are actually served from the same domain, so the connections can be reused. Establishing a network connection takes 30 ms to 200 ms; if the connection is reused, this time is saved.
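
A minimal sketch of the idea, assuming a hypothetical connect factory that opens a connection for a (scheme, host, port) triple: idle connections are kept per origin and handed out again instead of being reopened. Real HTTP clients usually provide such pooling themselves, so this only illustrates the principle.

```python
from urllib.parse import urlsplit


class ConnectionPool:
    """Keeps one idle connection per (scheme, host, port) for reuse."""

    def __init__(self, connect):
        self._connect = connect  # hypothetical factory: (scheme, host, port) -> connection
        self._idle = {}
        self.opened = 0          # how many real connections were established

    @staticmethod
    def _key(url):
        parts = urlsplit(url)
        port = parts.port or (443 if parts.scheme == "https" else 80)
        return parts.scheme, parts.hostname, port

    def acquire(self, url):
        """Return (key, connection), reusing an idle connection when one exists."""
        key = self._key(url)
        conn = self._idle.pop(key, None)
        if conn is None:
            conn = self._connect(*key)
            self.opened += 1
        return key, conn

    def release(self, key, conn):
        """Hand a connection back so the next request to the same origin can reuse it."""
        self._idle[key] = conn
```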

4.2 Preloading:

We do a lot of preloading in feeds. The usual approach is to instantiate one player to play the current video and then instantiate another Player that calls prepareAsync to pull the next one. To preload several videos, we simply instantiate several Players and call prepareAsync on each, which is a bad idea.

A player instance holds a large amount of data. Initializing the player also initializes MediaCodec, which corresponds to the underlying AVCodec and operates on the underlying /dev/codec-node. Android limits the maximum number of MediaCodec instances to 16. The exact number varies from phone to phone, but in general the number of instances is limited and cannot grow indefinitely. When we preload with multiple player instances, we create multiple codec instances, which can exceed the limit and make both the application and the system prone to problems. When troubleshooting, we often find that the media.codec process has caused SystemServer to freeze, usually because of improper use of MediaCodec.

So can we preload without a player instance?

The purpose of preloading is to request video resources, which in fact only needs the network module. A local proxy can be used to separate the network module from the player:

  • The player does not talk to the video source server directly, but goes through the local proxy layer
  • The network loading module of the local proxy layer is independent of the player and can be triggered either by the player or by other external callers
  • The URL finally passed to the player's setDataSource is a request to http://127.0.0.1:port. The local proxy layer returns the data for that URL over a socket, and the player parses the data stream directly, exactly as in normal playback
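
The sketch below illustrates only the URL mapping and the player-independent preload path. The /video?url= scheme, the class and method names, and the injected fetch function are all assumptions for illustration; a real proxy would listen on the local socket and stream the data instead of returning whole byte strings.

```python
from urllib.parse import parse_qs, quote, urlsplit


class LocalProxy:
    """Maps real video URLs to local URLs and serves them from a shared cache."""

    def __init__(self, port, fetch):
        self.port = port
        self._fetch = fetch  # hypothetical downloader: url -> bytes
        self._cache = {}

    def proxy_url(self, original_url):
        """URL handed to the player instead of the real one."""
        encoded = quote(original_url, safe="")
        return f"http://127.0.0.1:{self.port}/video?url={encoded}"

    def preload(self, original_url):
        """Fetch data ahead of time without creating a player instance."""
        if original_url not in self._cache:
            self._cache[original_url] = self._fetch(original_url)

    def handle(self, proxy_url):
        """Serve a player request, from the cache when the video was preloaded."""
        query = parse_qs(urlsplit(proxy_url).query)
        original_url = query["url"][0]
        self.preload(original_url)
        return self._cache[original_url]
```

Because preload() and handle() share one cache, a video preloaded by external code is not downloaded again when the player later requests its local URL.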

This way we can keep a single global player instance: videos can still be preloaded, and the player no longer takes up too many resources. This kills several birds with one stone.

4.3 Specify the container format and codec:

For some videos we already know the container format and the audio and video codecs, so we can pass this information to the player in advance. The player can then skip format sniffing, use the specified container format directly, and go straight to the specified decoder.

For example, feed videos are almost always in the MP4 container format with H.264 video and AAC audio.

This saves the time spent on sniffing and on searching for a MediaCodec decoder.

4.4 MP4 Video Optimization:

An MP4 file is parsed as follows: the moov box contains the metadata that describes the MP4 file, and the mdat box holds the actual audio and video data. The MP4 format requires the moov data to be parsed before the audio and video data in mdat can be parsed.

However, moov sometimes comes before mdat and sometimes after it. When moov comes before mdat, requesting the file sequentially works without problems. When moov comes after mdat, we cannot play the file sequentially. In that case we need to start loading with two IO buffers: one searches for moov from the beginning of the file, and the other searches for moov from the end. Once moov is found, we can parse moov first, then parse mdat, and play the video.

However, dual IO is time-consuming. If the server moves the moov box in front of mdat ahead of time, the first frame of MP4 videos loads faster.
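
To check which layout a file has, it is enough to walk the top-level boxes: each box starts with a 4-byte big-endian size and a 4-byte type, where a size of 1 means a 64-bit size follows and a size of 0 means the box runs to the end of the file. The sketch below assumes the bytes are already in memory, which is a simplification; a player would normally read only the box headers from the stream. The function names are our own.

```python
import struct


def top_level_boxes(data):
    """Yield (type, offset, size) for each top-level box in an MP4 byte string."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # the box extends to the end of the file
            size = len(data) - offset
        if size < header:
            break  # malformed box; stop rather than loop forever
        yield box_type.decode("ascii", "replace"), offset, size
        offset += size


def moov_before_mdat(data):
    """True if moov comes first, False if mdat does, None if either is missing."""
    order = [t for t, _, _ in top_level_boxes(data) if t in ("moov", "mdat")]
    if "moov" not in order or "mdat" not in order:
        return None
    return order.index("moov") < order.index("mdat")
```

Files for which moov_before_mdat returns False are the ones that benefit from being rewritten on the server. FFmpeg's `-movflags +faststart` output option is one common way to perform that relocation.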

4.5 Streaming Video Optimization:

Besides MP4, some on-demand video is streamed, for example in HLS format, which consists of a sequence of TS segments. To speed up the first frame of these videos, I suggest compressing the first few TS segments. For example, a 3-second TS segment whose original resolution is 1280 * 720 can be compressed to 320 * 180. That is one sixteenth of the pixel count, which greatly reduces the data volume, so the first frame loads quickly.

4.6 Caching while playing

When we play a video, it is best to cache it locally at the same time, so that when the video is opened again we do not have to request it and can simply reuse the local data.

Caching while playing can be implemented with the local proxy.

4.7 Video-ID Cache Reuse

Once caching while playing is in place, we can reuse the cache the second time a video is opened, but the cache is keyed by the video's URL. The URLs of feed videos now change frequently: even for the same video, the URL can change within half an hour, which makes URL-based reuse inefficient. Fortunately, feeds now deliver a video-ID with every video, and it does not change when the video URL changes. As long as the video is the same, the video-ID stays the same. We can therefore key the cache on the video-ID, so that one cached copy is reused many times.
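
A minimal sketch of keying the cache on the video-ID; the VideoCache class and the injected download function are hypothetical.

```python
class VideoCache:
    """Cache keyed by a stable video-ID instead of the changing URL."""

    def __init__(self):
        self._by_id = {}
        self.downloads = 0  # how many times the network was actually hit

    def get(self, video_id, url, download):
        """Return cached data for video_id, downloading from url only on a miss."""
        if video_id not in self._by_id:
            self._by_id[video_id] = download(url)
            self.downloads += 1
        return self._by_id[video_id]
```

With this key, a second request for the same video under a new URL is served from the cache, whereas a URL-keyed cache would download it again.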

5. Other suggestions for optimizing the playback experience

5.1 Frame loss occurs during playback

Frame loss during playback mainly occurs in live streaming applications. When it occurs, the server should proactively push a lower-bitrate stream to keep the client from dropping frames heavily or even stalling. Optimizing live streaming involves many details and deserves a separate article.

5.2 Jagged edges appear in the playback picture

There are two kinds of situations when a video is played: