This is the 14th day of my participation in the December Gwen Challenge. Check out the event details: The last Gwen Challenge 2021

I mentioned H.264 in passing when I introduced stream types in previous articles, but I recently started working on H.264 decoding again because of a garbled-screen (corrupted picture) problem.

Over the next few days, let's try to solve this problem. Today we will look at the solution proposed by the streamer side and the terms that appear in it.

The solution given by the streamer side

The streamer side offered two suggestions for the garbled-screen problem:

  1. The streamer side was changed to cache only the `SPS` and `PPS`, not the most recent `I frame`. It is therefore recommended that the SDK side throw away every `P frame` received before the first `I frame` instead of sending it to the decoder, which avoids the garbled picture.
  2. Does the SDK side have a `flush render` operation that can empty the display `buffer`?
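Suggestion 1 can be sketched as a small gate sitting in front of the decoder. This is only an illustration under simplified assumptions (frames arrive pre-classified by type; a real SDK would parse NAL units from the byte stream), not the SDK's actual API:

```python
# Hypothetical sketch of suggestion 1: drop every P frame that arrives
# before the first I frame, so the decoder never receives a P frame
# whose reference picture is missing (the cause of the garbled screen).

def gate_frames(frames):
    """Yield only frames that are safe to feed to the decoder.

    `frames` is an iterable of (frame_type, payload) tuples, where
    frame_type is one of "SPS", "PPS", "I", "P" -- a simplification
    of real NAL-unit parsing.
    """
    seen_i_frame = False
    for frame_type, payload in frames:
        if frame_type in ("SPS", "PPS"):
            yield frame_type, payload        # parameter sets always pass
        elif frame_type == "I":
            seen_i_frame = True              # a new sequence can start here
            yield frame_type, payload
        elif seen_i_frame:
            yield frame_type, payload        # P frames only after an I frame
        # otherwise: silently drop the stale P frame

stream = [("SPS", b"sps"), ("PPS", b"pps"),
          ("P", b"p0"), ("P", b"p1"),        # stale P frames -- dropped
          ("I", b"i0"), ("P", b"p2")]
safe = [t for t, _ in gate_frames(stream)]
# safe == ["SPS", "PPS", "I", "P"]
```

The key point is that a `P frame` without its reference picture is undecodable, so sending it to the decoder can only produce artifacts.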

Anyone familiar with H.264 will know about I frames, P frames, and B frames. Here is my understanding of them:

I frame, P frame, B frame

  • The `H.264` protocol defines three types of frames: a fully intra-coded frame is called an `I frame`; a frame that references the preceding `I frame` (or `P frame`) and encodes only the differences is called a `P frame`; and a frame that references both the preceding and the following frames is called a `B frame`.
  • In `H.264`, images are organized into sequences: a sequence is a segment of the encoded data stream that starts with an `I frame` and ends at the next `I frame`.

I frame (key frame, basic frame)

  1. An `I frame` is an intra-frame coded frame and represents a key frame;
  2. It serves as the reference frame for `P frames` and `B frames`;
  3. The data of an `I frame` is relatively large (because of this, the positions of I frames can be spotted with the naked eye in a video stream transmitted over a WS connection);
  4. When the decoder reaches an `I frame`, it outputs or discards all previously decoded data, looks up the parameter sets again, and starts a new sequence. This provides an opportunity to resynchronize after a major error in the previous sequence.
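In an H.264 Annex-B byte stream, the frame type can be read directly from the NAL unit header: the low 5 bits of the first byte after the start code are `nal_unit_type` (5 = IDR slice, i.e. an I frame; 7 = SPS; 8 = PPS; 1 = non-IDR slice). A minimal sketch:

```python
# Read nal_unit_type from the first NAL unit in an Annex-B chunk.
# Per the H.264 spec, the NAL header byte is: forbidden_zero_bit (1),
# nal_ref_idc (2), nal_unit_type (5) -- so the type is `byte & 0x1F`.

NAL_TYPES = {1: "non-IDR slice (P/B)", 5: "IDR slice (I frame)",
             7: "SPS", 8: "PPS"}

def nal_unit_type(data: bytes) -> int:
    """Return the nal_unit_type of the first NAL unit in `data`."""
    # Skip the 3- or 4-byte start code (00 00 01 / 00 00 00 01).
    if data.startswith(b"\x00\x00\x00\x01"):
        header = data[4]
    elif data.startswith(b"\x00\x00\x01"):
        header = data[3]
    else:
        header = data[0]        # assume the start code was already stripped
    return header & 0x1F        # low 5 bits

# 0x65 & 0x1F == 5 -> IDR (I frame); 0x67 -> SPS; 0x68 -> PPS
assert nal_unit_type(b"\x00\x00\x00\x01\x65" + b"payload") == 5
assert nal_unit_type(b"\x00\x00\x01\x67" + b"payload") == 7
```

This is how an SDK could classify incoming units as SPS/PPS/I/P before deciding whether to forward them to the decoder.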

P frame (differential frame)

  1. A `P frame` encodes the difference between this frame and the preceding frame (i.e. it is a difference frame); to decode it, the difference defined by this frame is superimposed on the previously cached picture to produce the final picture. (A `P frame` carries no complete picture data, only the data that differs from the previous frame.)
  2. A `P frame` is an encoded frame 1~2 frames after an `I frame`, with the `I frame` as its reference frame;
  3. Because it can itself serve as a reference frame, it may cause decoding errors to propagate;
  4. It has a high compression ratio.

B frame (bidirectional differential frame)

  1. A `B frame` records the differences between this frame and both the preceding and following frames;
  2. It is predicted from the preceding `I or P frame` and the following `P frame`;
  3. It is not a reference frame, so it does not cause decoding errors to propagate;
  4. It has the highest compression ratio.
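Because a B frame references a *future* frame, it cannot be decoded until that reference has arrived, so the transmission (decode) order differs from the display order. A simplified model, assuming each B frame references the nearest I/P frame on either side (frame names here are illustrative, not from any real stream):

```python
# Reorder a display-order GOP into decode order: each I/P frame must be
# emitted before the B frames that reference it.

def to_decode_order(display):
    """Return the decode order for a display-order list of frame labels
    ("I...", "P...", "B...")."""
    out, pending_b = [], []
    for frame in display:
        if frame.startswith("B"):
            pending_b.append(frame)   # hold until the next reference arrives
        else:
            out.append(frame)         # I or P frame: emit it first,
            out.extend(pending_b)     # then the B frames that waited for it
            pending_b.clear()
    out.extend(pending_b)
    return out

display_order = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
# to_decode_order(display_order)
# -> ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
```

This reordering is also why decoders buffer frames: B1 and B2 are displayed before P3 but can only be decoded after it.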

Of course, I frames, P frames, and B frames are not limited to the points above; these are only some of their simpler characteristics, meant to give a quick picture of what they are so that the subsequent analysis and solution of the problem is easier to follow.

The I frame serves as the base frame for predicting P frames, and I and P frames are in turn used to predict B frames. Yet the streamer asks us to cache only the SPS and PPS rather than the latest I frame, which puzzled me. In the next article we will look further into SPS and PPS, and the subsequent analysis and decoding may show why the SPS and PPS are cached instead of the latest I frame.