The problem

  • Differences in main processes

  • Buffer design

  • Memory management logic

  • Audio and video playback mode

  • Audio and video synchronization

  • Problems with seek: buffer flushing, playback-time display, inaccurate positioning when the keyframe spacing is large…

  • How do I release resources when I stop? Do I switch to a secondary thread?

  • When the network is poor — for example, frames arrive more slowly than they are consumed — playback will stall repeatedly unless it is paused. Does the player pause actively?

  • How are VTB (VideoToolbox) decoding and FFmpeg decoding unified? How is that architecture designed?

The data flow

For more detail, see the earlier ijkPlayer main-process analysis.

audio

  • av_read_frame

  • packet_queue_put

  • audio_thread+decoder_decode_frame+packet_queue_get_or_buffering

  • frame_queue_peek_writable+frame_queue_push

  • audio_decode_frame + frame_queue_peek_readable, data goes into is->audio_buf

sdl_audio_callback copies the data into stream. It is the function behind the upper-layer playback library's buffer filling: on iOS, which uses AudioQueue, the callback IJKSDLAudioQueueOuptutCallback calls down into it and then hands the data to the AudioQueue.
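To make the filling concrete, here is a minimal, self-contained sketch of a pull-style fill callback of this kind (an illustration of the pattern, not the actual ijkPlayer code; the struct and field names are invented for the example):

    #include <stdint.h>
    #include <string.h>

    /* The playback library asks for `len` bytes in `stream`; we drain a decoded
     * buffer and refill it on demand by pulling the next frame. */
    typedef struct AudioState {
        uint8_t buf[8192];                              /* one decoded/resampled frame  */
        int     buf_size;                               /* valid bytes in buf           */
        int     buf_index;                              /* bytes already handed out     */
        int   (*decode_frame)(struct AudioState *as);   /* fills buf, returns its size  */
    } AudioState;

    static void audio_fill_callback(void *opaque, uint8_t *stream, int len)
    {
        AudioState *as = opaque;
        while (len > 0) {
            if (as->buf_index >= as->buf_size) {        /* buffer used up: pull next frame */
                int size = as->decode_frame(as);
                if (size <= 0) {                        /* no data: output silence         */
                    memset(as->buf, 0, sizeof(as->buf));
                    size = sizeof(as->buf);
                }
                as->buf_size  = size;
                as->buf_index = 0;
            }
            int chunk = as->buf_size - as->buf_index;
            if (chunk > len)
                chunk = len;
            memcpy(stream, as->buf + as->buf_index, chunk);
            stream        += chunk;
            len           -= chunk;
            as->buf_index += chunk;
        }
    }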

video

The packet-reading part is the same as for audio.

Then video_thread, then ffpipenode_run_sync; for hardware decoding this lands in videotoolbox_video_thread, which reads packets with ffp_packet_queue_get_or_buffering.

SortQueuePush(ctx, newFrame) loads the decoded pixelBuffer into a sorted queue.

GetVTBPicture takes the frame wrapper back out of the sorted queue, so the sorted queue is only a temporary sorting tool — an idea worth absorbing. queue_picture puts the decoded frame into the frame buffer, and display goes through video_refresh + video_image_display2 + [IJKSDLGLView display:].

The final texture generation is in the render; for VTB the pixelBuffer is uploaded in yuv420sp_vtb_uploadTexture. With the render abstraction everything is unified; the shader is IJK_GLES2_getFragmentShader_yuv420sp.

Conclusion: There is no significant difference in the main process.

Buffer design

packetQueue:

1. Data structure design

PacketQueue uses two linked lists: one stores the data, the other multiplexes nodes, i.e. holds nodes that currently carry no data. Data is inserted at last_pkt and fetched from first_pkt. The reuse list starts at recycle_pkt: when data is taken out of a node, the empty node is pushed onto the head of the reuse list and becomes the new recycle_pkt; when data is stored, a node is taken from recycle_pkt and reused. The list nodes are like boxes: when data arrives it is put into a box, and when the data is taken out the box is returned to the reuse list.
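A toy sketch of that two-list idea (an assumption-level simplification, not the real PacketQueue: no locking, and the payload is reduced to a void pointer):

    #include <stdlib.h>

    typedef struct Node {
        void        *data;
        struct Node *next;
    } Node;

    typedef struct PacketQueue {
        Node *first_pkt, *last_pkt;   /* data list: put at the tail, get from the head   */
        Node *recycle_pkt;            /* reuse list: empty "boxes" waiting to be refilled */
    } PacketQueue;                    /* zero-initialize before use */

    static int queue_put(PacketQueue *q, void *data)
    {
        Node *n = q->recycle_pkt;             /* reuse a box if one is available */
        if (n)
            q->recycle_pkt = n->next;
        else if (!(n = malloc(sizeof(*n))))   /* otherwise allocate a new one */
            return -1;
        n->data = data;
        n->next = NULL;
        if (q->last_pkt)
            q->last_pkt->next = n;
        else
            q->first_pkt = n;                 /* queue was empty */
        q->last_pkt = n;
        return 0;
    }

    static void *queue_get(PacketQueue *q)
    {
        Node *n = q->first_pkt;
        if (!n)
            return NULL;                      /* the real code blocks and waits here */
        q->first_pkt = n->next;
        if (!q->first_pkt)
            q->last_pkt = NULL;
        void *data = n->data;
        n->data = NULL;
        n->next = q->recycle_pkt;             /* return the empty box to the reuse list */
        q->recycle_pkt = n;
        return data;
    }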

2. Blocking control on put/get

Data may not be available when fetching, so there are a few options: return immediately, or block and wait. What ijkPlayer does is block and wait, and it also pauses playback. This answers the poor-network question from the list at the top: what you see from the outside is that when the network stalls, playback stops, then plays smoothly for a while, then stalls again — playing and stalling are clearly separated. There is no blocking control when data is put in, so why doesn't the queue grow without bound? There is a limit, but it is not inside the packetQueue; it is in the read_thread loop:

    if (ffp->infinite_buffer < 1 && !is->seek_req &&
          (is->audioq.size + is->videoq.size + is->subtitleq.size > ffp->dcc.max_buffer_size
        || (   stream_has_enough_packets(is->audio_st,    is->audio_stream,    &is->audioq,    MIN_FRAMES)
            && stream_has_enough_packets(is->video_st,    is->video_stream,    &is->videoq,    MIN_FRAMES)
            && stream_has_enough_packets(is->subtitle_st, is->subtitle_stream, &is->subtitleq, MIN_FRAMES)))) {
        if (!is->eof) {
            ffp_toggle_buffering(ffp, 0);
        }
        /* wait 10 ms */
        SDL_LockMutex(wait_mutex);
        SDL_CondWaitTimeout(is->continue_read_thread, wait_mutex, 10);
        SDL_UnlockMutex(wait_mutex);
        continue;
    }

To simplify it:

  • ffp->infinite_buffer < 1: infinite buffering is not enabled

  • is->audioq.size + is->videoq.size + is->subtitleq.size > ffp->dcc.max_buffer_size: the total size of buffered data exceeds the limit

  • stream_has_enough_packets: a limit on the number of buffered packets

Because the packet-count limit is set to 50000 it is practically never reached, so the effective limit is the data size, about 15 MB. Two points are essential here:

  • Data size is used as the limit because, across different videos, resolution makes the size of a packet vary enormously, and what we actually care about is memory.

  • It pauses 10 ms at a time instead of waiting indefinitely on a condition variable for a signal. The design is simpler and avoids frequent wait+signal. Whether that is the better trade-off is worth pondering, but intuitively it feels right.

frameQueue:

The data is stored in a plain array that can be thought of as a ring: one segment holds data, the other is empty. rindex is where the data starts, i.e. the read index; windex is where the empty segment starts, i.e. the write index. size is the current amount of data and max_size is the number of slots; when size reaches max_size, writing blocks and waits, and when size is 0, reading blocks and waits. rindex_shown seems to mark whether the frame at rindex has already been shown; it is a bit confusing, and we will come back to it later.
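A toy sketch of the ring-array indexing just described (simplified assumption: no mutex/condition variables, rindex_shown ignored, and the slot type reduced to a void pointer):

    #define FQ_MAX_SIZE 16

    typedef struct FrameQueue {
        void *slots[FQ_MAX_SIZE];
        int   rindex;      /* read index: where the data segment starts   */
        int   windex;      /* write index: where the empty segment starts */
        int   size;        /* slots currently holding data                */
        int   max_size;    /* capacity in use (<= FQ_MAX_SIZE)            */
    } FrameQueue;

    /* producer: slot to fill, or NULL if full (the real code blocks and waits) */
    static void **frame_queue_peek_writable(FrameQueue *fq)
    {
        return fq->size >= fq->max_size ? NULL : &fq->slots[fq->windex];
    }

    static void frame_queue_push(FrameQueue *fq)
    {
        if (++fq->windex == fq->max_size)
            fq->windex = 0;                    /* wrap around the ring */
        fq->size++;
    }

    /* consumer: slot to read, or NULL if empty (the real code blocks and waits) */
    static void **frame_queue_peek_readable(FrameQueue *fq)
    {
        return fq->size <= 0 ? NULL : &fq->slots[fq->rindex];
    }

    static void frame_queue_next(FrameQueue *fq)
    {
        if (++fq->rindex == fq->max_size)
            fq->rindex = 0;
        fq->size--;
    }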

Conclusion: the buffer design is completely different from mine, but both use the idea of reuse, with nodes acting as boxes that wrap the data. Performance is hard to compare, but my design is more complete: frames and packets share a unified design, including the sorting function.

Memory management

The management of the packet

The packet comes out of av_read_frame with a reference count of 1, held in a temporary variable, i.e. stack memory. When it joins the queue, pkt1->pkt = *pkt; stores the packet by value copy, so the data in the buffer is decoupled from the temporary variable outside.

packet_queue_get_or_buffering retrieves the packet, again by value copy. Finally av_packet_unref releases the buf that the packet references, while the temporary packet variable itself can keep being reused.

Note: when avcodec_send_packet returns EAGAIN, the decoder cannot accept the new packet yet. In that case:

    d->packet_pending = 1;
    av_packet_move_ref(&d->pkt, &pkt);

The packet is saved into d->pkt. On the next loop iteration, frames are drained first, and then this packet is taken back:

    if (d->packet_pending) {
        av_packet_move_ref(&pkt, &d->pkt);
        d->packet_pending = 0;
    }

This probably happens with B-frames: a B-frame depends on later frames, so it is not decoded immediately, and once the later frames arrive there are several frames to read out at once, during which the decoder will not accept a new packet. The ijkPlayer code does not appear to read frames repeatedly in one pass, though, so the exact circumstances of the EAGAIN error are unclear to me; something to investigate. Also note that av_packet_move_ref is a complete move: the source's contents are transferred to the destination and the source is reset, and the buf reference count does not change.
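Putting the two fragments together, the shape of the loop is roughly the following (a simplified paraphrase of the ffplay-style decoder_decode_frame, with error handling and the audio/subtitle specifics omitted; not a verbatim copy):

    static int decode_loop(Decoder *d, AVFrame *frame, FFPlayer *ffp)
    {
        AVPacket pkt;
        for (;;) {
            /* 1. drain whatever frames the decoder already has */
            int ret = avcodec_receive_frame(d->avctx, frame);
            if (ret >= 0)
                return 1;                      /* got a frame */
            if (ret == AVERROR_EOF)
                return 0;                      /* (other errors omitted in this sketch) */

            /* 2. fetch the next packet: the one parked earlier, or a new one */
            if (d->packet_pending) {
                av_packet_move_ref(&pkt, &d->pkt);
                d->packet_pending = 0;
            } else if (packet_queue_get_or_buffering(ffp, d->queue, &pkt,
                                                     &d->pkt_serial, &d->finished) < 0) {
                return -1;
            }

            /* 3. feed it; if the decoder refuses (EAGAIN), park it and loop to 1 */
            if (avcodec_send_packet(d->avctx, &pkt) == AVERROR(EAGAIN)) {
                d->packet_pending = 1;
                av_packet_move_ref(&d->pkt, &pkt);
            } else {
                av_packet_unref(&pkt);
            }
        }
    }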

Memory management for video frames

In ffplay_video_thread, frame is a piece of memory; get_video_frame reads from the decoder into it. av_frame_unref releases the buf held by the frame while the frame itself can keep being reused. av_frame_unref is called in every case, so each frame that is read is paired with an unref, matching its initialization. This one-to-one pairing is an important principle when managing memory with reference counts.

But at this point the frame is only being put into the buffer, not used yet; if its buf were released now, the data would be gone by the time it is played. So what is done? When the frame is put into the buffer in queue_picture, it goes through SDL_VoutFillFrameYUVOverlay, which dispatches to the upper layer and does different things depending on the decoder. Take func_fill_frame in ijksdl_vout_overlay_ffmpeg.c as an example; there are two treatments:

One: the overlay shares memory with the frame, i.e. the frame's memory is displayed directly. This is the case when the format is already yuv420p, because OpenGL can display that color space directly. Here the frame just gets an extra reference so that it is not freed: av_frame_ref(opaque->linked_frame, frame);

Two: the format has to be converted into opaque->managed_frame. The data moves to the new location and the original frame is no longer needed; with no extra ref it is released naturally.
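Schematically, the two treatments look something like this (an illustrative sketch of func_fill_frame's idea, not its exact code; img_convert_ctx is an assumed SwsContext held alongside managed_frame):

    if (frame->format == AV_PIX_FMT_YUV420P) {
        /* 1) share memory: keep a reference so the frame's buffers stay alive;
         *    the overlay's pixel pointers then point straight into this frame */
        av_frame_unref(opaque->linked_frame);
        av_frame_ref(opaque->linked_frame, frame);
    } else {
        /* 2) convert: the data is copied into managed_frame, so the source frame
         *    needs no extra reference and the caller's unref releases it */
        sws_scale(opaque->img_convert_ctx,
                  (const uint8_t * const *)frame->data, frame->linesize,
                  0, frame->height,
                  opaque->managed_frame->data, opaque->managed_frame->linesize);
    }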

Audio frame processing

In audio_thread, decoder_decode_frame keeps producing new frames. As with video, frame is a piece of memory, and after a decoded frame is read its reference count is 1. The audio format conversion happens in the play phase, so here the frame is simply stored: av_frame_move_ref(af->frame, frame); — a move that transfers the read frame into the buffer. When the frame buffer is consumed, frame_queue_next calls av_frame_unref, the same as for video. One subtlety: the frame must stay alive while the audio is being read, because if no format conversion is needed the data is read directly out of the frame; it can only be released once the player's buffer has been filled. The unref sits in frame_queue_next, which only runs when the next frame is fetched, and that only happens after the current frame's data has been consumed — so the frame is released after its data has been read, and everything is fine.

    if (is->audio_buf_index >= is->audio_buf_size) {
        audio_size = audio_decode_frame(ffp);
        ...
    }

Audio and video playback mode

  • Audio playback using AudioQueue:

  • Build AudioQueue: AudioQueueNewOutput

  • Start with AudioQueueStart, pause with AudioQueuePause, and stop with AudioQueueStop

  • In the IJKSDLAudioQueueOuptutCallback callback, call the lower-layer fill function to fill the AudioQueue buffer.

  • Use AudioQueueEnqueueBuffer to enqueue the filled buffer for playback.

The above are standard AudioQueue operations; the special part is building the AudioStreamBasicDescription, i.e. specifying the audio playback format. The format is determined by the audio source: looking at IJKSDLGetAudioStreamBasicDescriptionFromSpec, apart from fixing the format to PCM, everything else is copied from the format handed up by the lower layer. This gives a lot of freedom — the audio source only needs to be decodable to PCM. The lower-layer format is decided in audio_open, and the logic is:

From the source file, build a desired format, wanted_spec, and hand it to the upper layer; the upper layer's actual format comes back as the result. It works like a negotiation, an approach worth borrowing.

If the upper layer does not accept the format, an error is returned and the lower layer changes the channel count or the sample rate and tries again. The sample format, however, is fixed to s16, a signed 16-bit integer. Bit depth is the amount of memory per sample: 16 bits plus a sign, so the range is [-2^15, 2^15-1]; 2^15 is 32768, which gives enough resolution.

Because this is PCM — uncompressed audio — the only deciding factors are sample rate, channel count and sample format. With the sample format fixed to s16, the negotiation with the upper layer only has to settle the sample rate and channel count. It is a good example of a layered architecture: common at the bottom, with platform-specific differences at the top.
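A sketch of that fallback negotiation, modeled on ffplay's audio_open (ijkPlayer routes the same idea through its SDL_Aout wrapper instead of SDL_OpenAudio; the fallback tables are ffplay's, and the SDL types are used purely for illustration — requires <SDL.h> plus FFmpeg's FFMIN/FF_ARRAY_ELEMS macros):

    static int open_audio_negotiated(int wanted_sample_rate, int wanted_nb_channels,
                                     SDL_AudioCallback cb, void *opaque,
                                     SDL_AudioSpec *spec)   /* out: accepted format */
    {
        static const int next_nb_channels[]  = {0, 0, 1, 6, 2, 6, 4, 6};
        static const int next_sample_rates[] = {0, 44100, 48000, 96000, 192000};
        int next_sample_rate_idx = FF_ARRAY_ELEMS(next_sample_rates) - 1;

        SDL_AudioSpec wanted_spec;
        wanted_spec.freq     = wanted_sample_rate;   /* taken from the source */
        wanted_spec.channels = wanted_nb_channels;   /* taken from the source */
        wanted_spec.format   = AUDIO_S16SYS;         /* fixed: signed 16-bit  */
        wanted_spec.silence  = 0;
        wanted_spec.samples  = 1024;
        wanted_spec.callback = cb;                   /* the fill function     */
        wanted_spec.userdata = opaque;

        while (next_sample_rate_idx &&
               next_sample_rates[next_sample_rate_idx] >= wanted_spec.freq)
            next_sample_rate_idx--;

        while (SDL_OpenAudio(&wanted_spec, spec) < 0) {
            /* rejected by the upper layer: try fewer channels first */
            wanted_spec.channels = next_nb_channels[FFMIN(7, wanted_spec.channels)];
            if (!wanted_spec.channels) {
                /* channels exhausted: fall back to the next lower sample rate */
                wanted_spec.freq     = next_sample_rates[next_sample_rate_idx--];
                wanted_spec.channels = wanted_nb_channels;
                if (!wanted_spec.freq)
                    return -1;                       /* no usable configuration */
            }
        }
        return 0;   /* *spec now holds the format the upper layer actually accepted */
    }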

Video playback:

Everything is drawn with OpenGL ES. IJKSDLGLView overrides layerClass to change the layer type to CAEAGLLayer so it can display OpenGL ES rendering. Every pixel format is displayed through this view; the differences are abstracted into the render role, whose related methods are:

  • SetupRenderer builds a render

  • IJK_GLES2_Renderer_renderOverlay Draws overlay.

Render’s build includes:

  • Building the program from a format-specific fragment shader and the common vertex shader

  • Provide MVP matrix

  • Sets vertex and texture coordinate data

Render includes:

  • Func_uploadTexture locates to different render and performs different texture upload operations

  • Drawing with glDrawArrays(GL_TRIANGLE_STRIP, 0, 4); the primitive type GL_TRIANGLE_STRIP is used instead of GL_TRIANGLES to save vertices.

How the textures are provided also matters; the differences lie in the color space and the element layout:

There are three RGB variants: 565, 888 and 8888. The RGB elements are interleaved, i.e. there is only one plane. 565 is the number of bits occupied by the R, G and B elements; likewise 888, and 8888 additionally includes an alpha element. So 565 is 2 bytes per pixel, 888 is 3 bytes and 8888 is 4 bytes.

    glTexImage2D(GL_TEXTURE_2D,
                 0,
                 GL_RGBA,
                 widths[plane],
                 heights[plane],
                 0,
                 GL_RGBA,
                 GL_UNSIGNED_BYTE,
                 pixels[plane]);

The difference when building a texture is the format and type parameters.

  • yuv420p: the most common layout, with the y, u and v elements each in its own plane (three planes), in a 4:1:1 sample ratio, so the u and v textures are each half the width and half the height of the y texture. Since each component has its own texture, every texture is single-channel and uses the format GL_LUMINANCE.

  • yuv420sp: the yuv ratio is also 4:1:1; the difference is that u and v are not split into two planes but interleaved in one plane — planar is uuuuvvvv, interleaved is uvuvuvuv. So two textures are built: y is unchanged, and the uv texture uses the two-channel format GL_RG_EXT, again 1/4 the size of y (1/2 width and height). The fragment shader then samples the values differently (the corresponding texture uploads are sketched after this list):

    // three planes (yuv420p):
    yuv.y = (texture2D(us2_SamplerY, vv2_Texcoord).r - 0.5);
    yuv.z = (texture2D(us2_SamplerZ, vv2_Texcoord).r - 0.5);

    // two planes (yuv420sp):
    yuv.yz = (texture2D(us2_SamplerY, vv2_Texcoord).rg - vec2(0.5, 0.5));

With u and v in the same texture, texture2D reads the rg components directly.

  • yuv444p I don't fully understand; from its fragment shader it looks as if each pixel carries two versions of yuv, which are then interpolated. The last format is yuv420p_vtb, the display path for data hard-decoded by VideoToolbox: since that data lives in a CVPixelBuffer, the iOS system's own texture-creation path is used directly.
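For comparison, the texture uploads for the two planar layouts look roughly like this (a sketch of the idea; texture ids and plane pointers such as tex_y or uv_plane are placeholders, and the real code lives in the per-format upload functions of the render):

    /* yuv420p: three single-channel GL_LUMINANCE textures; u and v are
     * half-width and half-height relative to y */
    glBindTexture(GL_TEXTURE_2D, tex_y);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, w,     h,     0,
                 GL_LUMINANCE, GL_UNSIGNED_BYTE, y_plane);
    glBindTexture(GL_TEXTURE_2D, tex_u);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, w / 2, h / 2, 0,
                 GL_LUMINANCE, GL_UNSIGNED_BYTE, u_plane);
    glBindTexture(GL_TEXTURE_2D, tex_v);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, w / 2, h / 2, 0,
                 GL_LUMINANCE, GL_UNSIGNED_BYTE, v_plane);

    /* yuv420sp: one GL_LUMINANCE texture for y plus one two-channel texture
     * (GL_RG_EXT on GLES2) for the interleaved uv plane */
    glBindTexture(GL_TEXTURE_2D, tex_y);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, w,     h,     0,
                 GL_LUMINANCE, GL_UNSIGNED_BYTE, y_plane);
    glBindTexture(GL_TEXTURE_2D, tex_uv);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RG_EXT,    w / 2, h / 2, 0,
                 GL_RG_EXT,     GL_UNSIGNED_BYTE, uv_plane);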

OpenGL ES in ijkPlayer is version 2.0; if version 3.0 were used, the two-channel texture could use GL_LUMINANCE_ALPHA.

Audio and video synchronization

First, audio. No blocking control is applied to audio: whenever the upper-layer player asks, data is filled in; there is no "not yet time, don't fill" logic. So the audio clock is the default master, and audio itself is left alone.

  1. Time control for video display

Video control is in video_refresh, and the playback function is video_display2; reaching it means the time has come to display a frame, so it is a checkpoint. A few variables to know:

  • is->frame_timer: the time at which the previous frame was played

  • delay: the interval from the previous frame to the current one

if (isnan(is->frame_timer) || time < is->frame_timer){

is->frame_timer = time;

}

If the previous frame's playback time is later than the current time, the data is wrong, so frame_timer is reset to the current time. Then:

    if (time < is->frame_timer + delay) {
        *remaining_time = FFMIN(is->frame_timer + delay - time, *remaining_time);
        goto display;
    }

is->frame_timer + delay is the time at which the current frame should play. If that is later than the current time, the frame is not due yet. Note that goto display does not actually play anything: the display block checks is->force_refresh, which is still false at this point, so the jump effectively does nothing and just ends this round. Otherwise, if the scheduled time is earlier than the current time, the frame should be played immediately: is->frame_timer += delay; execution continues downward, and further on is->force_refresh = 1; is set — that is when the frame actually gets displayed.

From these two pieces the basic flow is clear: at first the current frame's play time has not arrived, so we goto display and wait for the next loop; after a few iterations time has moved on, the play time arrives, the current frame is shown, and frame_timer is advanced to the current frame's time; then the process repeats for the next frame. That raises a question: why is frame_timer advanced by delay instead of being set to the current time? Because the loop only wakes at discrete moments, the actual display moment is always slightly later than the ideal moment frame_timer + delay. If frame_timer were set to the actual current time, that small lag would be carried into the next frame's schedule, and since it happens on every frame the error would keep accumulating into a large drift. Advancing by delay keeps the schedule anchored to the ideal timeline. Of course, if frame_timer ever falls too far behind, there is a correction:

    if (delay > 0 && time - is->frame_timer > AV_SYNC_THRESHOLD_MAX) {
        is->frame_timer = time;
    }

When frame_timer has fallen too far behind, it is reset directly to the current time, and subsequent playback gets back on track.

  2. The synchronization clock and clock correction

The concept of the synchronization clock: if audio or video content plays correctly and completely, each point in the content corresponds to exactly one moment in time; wherever the audio or video currently is, that position can be represented by a time, and that time is the sync clock's value. The audio clock tells you where audio playback is, and the video clock tells you where video playback is. Because audio and video are processed separately, their progress can diverge, which shows up as the two sync clocks holding different values; bringing the two clocks together is the audio-video synchronization problem. With the sync-clock concept, keeping audio and video content in sync can be restated more precisely as: keep the audio clock and the video clock at the same time. One sync clock serves as the master clock, and the others adjust their time to it: speed up when behind, slow down when ahead.

The adjustment is in compute_target_delay. diff = get_clock(&is->vidclk) - get_master_clock(is) is the offset of the video clock from the master clock (negative means the video is behind, positive means it is ahead). Then:

    if (diff <= -sync_threshold)
        delay = FFMAX(0, delay + diff);
    else if (diff >= sync_threshold && delay > AV_SYNC_FRAMEDUP_THRESHOLD)
        delay = delay + diff;
    else if (diff >= sync_threshold)
        delay = 2 * delay;

As for why the third case is not also delay + diff: my guess is that delay + diff corrects the entire gap between the video and the master clock within a single frame. The gap may already be quite large, and a one-step correction produces this effect: the picture visibly pauses while the sound keeps going until the video is back in sync. With 2 * delay, each frame corrects a little and the gap is closed gradually over many frames, so the change is smoother: picture and sound both keep running, the sound gradually catches up with the picture, and finally they are in sync. It is hard to say why a one-step fix was chosen for the second case and a gradual fix for the third; since AV_SYNC_FRAMEDUP_THRESHOLD is 0.15, which corresponds to a frame rate of roughly 7, the video is basically a slideshow at that point, so I guess a gradual fix would be pointless there anyway.

  3. How the synchronization clock's time is obtained

get_clock reads the time and set_clock_at updates it. The read is: c->pts_drift + time - (time - c->last_updated) * (1.0 - c->speed). Why write it that way? Since pts_drift = c->pts - c->last_updated, we have pts_drift + time = c->pts + (time - c->last_updated). Writing time_diff = time - c->last_updated, the whole expression becomes c->pts + time_diff - (1.0 - c->speed) * time_diff = c->pts + c->speed * time_diff. That is: the clock read at the last update was c->pts, an interval time_diff of real time has passed since then, and at playback speed c->speed the media has advanced by c->speed * time_diff. For example, if 10 seconds of real time pass at 2x speed, the media advances 20 seconds. Seen this way the expression is clear. set_clock is also called inside set_clock_speed, so that the interval since the last update is accounted for at the old speed before the speed changes; otherwise the calculation would be wrong. That is about it; the one remaining point is how the sync clock is handled during seek, which we will look at together with seek.
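For reference, the pair of functions is short enough to sketch in full (ffplay-style; ijkPlayer's versions are essentially the same shape):

    static void set_clock_at(Clock *c, double pts, int serial, double time)
    {
        c->pts          = pts;             /* media content time at the update */
        c->last_updated = time;            /* real time of the update          */
        c->pts_drift    = c->pts - time;   /* so pts_drift + time == pts + (time - last_updated) */
        c->serial       = serial;
    }

    static double get_clock(Clock *c)
    {
        if (*c->queue_serial != c->serial)
            return NAN;                    /* stale clock, e.g. right after a seek */
        if (c->paused)
            return c->pts;
        double time = av_gettime_relative() / 1000000.0;
        /* equals c->pts + c->speed * (time - c->last_updated) */
        return c->pts_drift + time - (time - c->last_updated) * (1.0 - c->speed);
    }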

The processing of the seek

Seek means dragging the progress bar to a new position and playing from there. This disrupts the existing data flow, and part of the playback state has to be re-established. The issues to handle include:

  • Buffered data must be released, and released completely

  • Play time display

  • Maintenance of the “loading” state, which affects the presentation of the user interface

  • The problem of eliminating error frames

process

The seek request is posted with ffp_notify_msg2(mp->ffplayer, FFP_REQ_SEEK, (int)msec). When the message is handled it ends up in stream_seek, which sets seek_req to 1 and records the target in seek_pos. Then in read_thread, when is->seek_req is true, the seek branch is entered:

  • ffp_toggle_buffering turns decoding off, so the packet queues stop being consumed

  • avformat_seek_file is called to do the actual seek

  • On success, the buffers are flushed with packet_queue_flush and flush_pkt is inserted into them to mark the data

  • The current serial is recorded

The points worth learning here are:

  • In my own player the seek is handled by calling FFmpeg's seek from a separate thread; here it is done directly in the read thread, so there is no need to wait for the read loop to finish

  • Flush buffers after seek succeeds

because, in packet_queue_put_private:

    if (pkt == &flush_pkt)
        q->serial++;

This is where serial gets its meaning: every seek increments serial by 1, so serial acts as a mark, and entries with the same serial belong to the same seek generation.

Then on to decoder_decode_frame (a simplified sketch of the relevant code follows this list):

  • Since the seek changes are made in the read thread, not in the decode thread, the seek can take effect at any point in this code.

  • The loop while (d->queue->serial != d->pkt_serial) — block 2 — keeps fetching until the packet's serial matches the queue's current serial, so everything queued before the seek is discarded; the check if (pkt.data == flush_pkt.data) — block 3 — then tells whether the packet just fetched is the flush marker.

  • packet_queue_get_or_buffering will sooner or later return flush_pkt, so block 3 is always reached, and there avcodec_flush_buffers is executed to clear the decoder's internal cache.

  • If the seek happens after block 2, only block 4 (normal decoding) runs for this iteration; on the next loop, block 2 and block 3 run, and avcodec_flush_buffers discards whatever packets the decoder was still holding.

  • Combining the two cases, only packets from after the seek ever get decoded — neat!
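The sketch referred to above — the seek-related shape of decoder_decode_frame, paraphrased (declarations omitted; "block 2/3/4" are the pieces discussed in the list):

    do {
        if (packet_queue_get_or_buffering(ffp, d->queue, &pkt,
                                          &d->pkt_serial, &d->finished) < 0)
            return -1;
    } while (d->queue->serial != d->pkt_serial);   /* block 2: drop pre-seek packets */

    if (pkt.data == flush_pkt.data) {              /* block 3: hit the flush marker  */
        avcodec_flush_buffers(d->avctx);           /* clear the decoder's own cache  */
        d->finished = 0;
    } else {                                       /* block 4: normal decoding       */
        /* ... send the packet to the decoder ... */
    }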

What stands out here:

  • The seek state can change at any moment, yet nothing goes wrong

  • The seek handling is done inside the decode thread itself, which removes the condition locks and other cross-thread communication, making it simpler and more stable. If the whole stream is a river, flush_pkt is like a buoy floating in it: the color of the water changes behind the buoy. The data upgrades itself as it flows, rather than having a third party step in to do the upgrading, which suits pipelined program logic well.

4. Play

video_refresh:

    if (vp->serial != is->videoq.serial) {
        frame_queue_next(&is->pictq);
        goto retry;
    }

Audio, in audio_decode_frame:

    do {
        if (!(af = frame_queue_peek_readable(&is->sampq)))
            return -1;
        frame_queue_next(&is->sampq);
    } while (af->serial != is->audioq.serial);

Both skip over old data by checking serial. Overall, the most powerful piece of the seek design is using serial to mark the data, so it is always clear what is stale and what is new. The handling is then done in the thread that already owns the data, rather than having another thread reach in and modify it, which removes troublesome thread control and communication and improves stability.

Playback time acquisition

ijkmp_get_current_position: while a seek is in progress it returns the seek target time; otherwise it returns get_master_clock minus the stream start time is->ic->start_time. The content position jumps hugely during a seek, so how is the clock kept correct?

The pts of audio and video frames is frame->pts * av_q2d(tb), i.e. content time converted to seconds. Then is->audio_clock = af->pts + (double)af->frame->nb_samples / af->frame->sample_rate; so is->audio_clock is the content time at the end of the most recently fetched audio frame.

    set_clock_at(&is->audclk,
                 is->audio_clock
                     - (double)(is->audio_write_buf_size) / is->audio_tgt.bytes_per_sec
                     - SDL_AoutGetLatencySeconds(ffp->aout),
                 is->audio_clock_serial,
                 ffp->audio_callback_time / 1000000.0);

Because is->audio_write_buf_size = is->audio_buf_size - is->audio_buf_index, i.e. the part of the current frame's data not yet copied out, (double)(is->audio_write_buf_size) / is->audio_tgt.bytes_per_sec is the playback time of the data still sitting in audio_buf. SDL_AoutGetLatencySeconds(ffp->aout) is the playback time of the data in the upper-layer buffers: for AudioQueue on iOS there are several AudioQueueBuffers waiting to be played, and this is how long they will take to finish. The timeline looks like this: [end of current frame][remaining audio_buf time][upper-layer buffer time][content that just finished playing]. So the second argument is: content time at the end of the current frame, minus the remaining audio_buf time, minus the upper-layer buffer time — that is, the content time of the audio that has just finished playing. ffp->audio_callback_time is the moment the fill callback was invoked; on the assumption that the upper-layer player calls the fill function the instant it finishes playing a buffer, ffp->audio_callback_time is the real time at which that content finished playing, so the second and fourth arguments describe the same moment. Back to seek: after a seek completes, the first new frame to play sets the sync clock's pts — the media content time — to the post-seek position. That raises a question: in the window between the seek request and the first new frame playing, the sync clock still carries the pre-seek time, so the progress bar ought to flash back to the pre-seek position. It does not, because get_clock contains if (*c->queue_serial != c->serial) return NAN;

This serial really is a god-tier trick — so handy! c->queue_serial is a pointer: the clock is initialized with init_clock(&is->vidclk, &is->videoq.serial), so it shares memory with the packetQueue's serial. When c->queue_serial != c->serial, a seek has bumped the packetQueue's serial but no post-seek frame has updated the clock yet, so the clock's time is invalid and NAN is returned; during that window the position query keeps returning the seek target, so the displayed time never jumps back.

Resource release during stop

From the shutdown method down to the core release function stream_close, the flow is as follows:

1. Stop the read thread:

packet_queue_abort aborts the packetQueue: abort_request is set to 1, and then SDL_WaitThread waits for the read thread to exit.

2. Close each stream with stream_component_close, which:

  • decoder_abort aborts the packetQueue, releases anything blocked on the frameQueue, waits for the decode thread to exit, and then flushes the packetQueue

  • decoder_destroy destroys the decoder

  • resets the stream fields to NULL

3. Stop the display thread: the display loop checks the data streams (is->video_st for video, is->audio_st for audio); since the previous step set them to NULL, the display thread ends. Again SDL_WaitThread is used to wait for the thread to exit.

4. packet_queue_destroy destroys the packetQueue and frame_queue_destory destroys the frameQueue.

Comparing with my own implementation, what needs to change:

  • Threads are terminated by waiting on them (pthread_join) rather than with locks

  • The decoder, buffers and so on are all destroyed and rebuilt on the next play; nothing is reused

  • Audio is stopped by stopping the upper-layer player; the lower layer is passive and has no loop thread of its own. Stopping video just waits for its thread to end.

The core is the first point: use pthread_join to wait for the thread to terminate.
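The pattern itself is tiny: set an abort flag, wake anything blocked, then join (a minimal sketch of the idea; abort_request, the queue's mutex/cond and the SDL thread wrappers are as described above):

    SDL_LockMutex(q->mutex);
    q->abort_request = 1;            /* tell the loop to exit           */
    SDL_CondSignal(q->cond);         /* wake it if it is blocked on get */
    SDL_UnlockMutex(q->mutex);

    SDL_WaitThread(read_tid, NULL);  /* wraps pthread_join: blocks until the
                                        thread function has returned */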

Poor network handling

It automatically pauses and waits; play and pause can be driven internally.

Architecture unification when using VTB

  • The frame buffer uses a custom data structure, Frame, to unify the two styles.

  • The bottom layer has the frame data, the top layer has Vout, and the boundary sits right there. The top layer wants an overlay, so the questions are how to turn a frame into an overlay and how to display an overlay. These two operations are provided by Vout: create_overlay and display_overlay.

  • With VTB the decoded data lives in a pixelBuffer, while FFmpeg-decoded data lives in an AVFrame; the difference is absorbed by the different overlay-creation functions.

Conclusion:

  • To connect two modules in a unified way, both sides need to wrap things in a unified model;

  • Inside the unified model, the differing operations are subdivided;

  • The data flows from A to B, so the subdivision is provided by B, since B is the receiver and knows what result it needs.

  • This keeps the execution flow the same everywhere, so the process stays stable, while the places that differ in implementation can adapt to each platform's specific needs.

Original author: FindCrt, the original link: https://www.jianshu.com/p/814f3a0ee997


