1. Player framework

Commonly used audio and video terms

  • Container/File: a multimedia file in a specific format, such as MP4, FLV, or MKV.
  • Stream: a continuous piece of data on the timeline, such as a piece of audio data, video data, or subtitle data. The data can be compressed or uncompressed; compressed data must be associated with a specific codec (some audio streams are raw PCM).
  • Frame/Packet: a media stream is usually composed of a large number of data frames. For compressed data, a frame is the smallest unit the codec processes. Data frames belonging to different media streams are interleaved when stored in a container.
  • Codec: a codec converts between compressed data and raw data, one frame at a time.

2. Common concepts

  • multiplexer

  • codecs

FFmpeg library introduction

FFmpeg has eight commonly used libraries:

  • AVUtil: The core tool library that many other modules rely on for basic audio and video processing operations.

  • AVFormat: file format and protocol library. This is one of the most important modules. It encapsulates the protocol layer and the demuxer/muxer layer, making protocols and container formats transparent to developers.

  • AVCodec: codec library. FFmpeg does not include libraries such as libx264 or FDK-AAC by default, but it is a platform to which third-party codecs can be added as plug-ins, and it then offers developers a unified interface to them.

  • AVFilter: audio and video filter library. This module provides audio-effect and video-effect processing. When encoding or decoding through the FFmpeg API, using this module directly to process audio and video data is both convenient and efficient.

  • AVDevice: input/output device library. For example, to build the ffplay tool for playing audio or video, this module must be enabled. SDL must also be built beforehand, because the device module uses the SDL library to play sound and video.

  • SWResample: this module is used for audio resampling. It can convert the channel count, sample format, sample rate, and other basic properties of digital audio.

  • SWScale: this module is used for image format conversion and scaling. For example, it can convert YUV data to RGB data, or scale an image from 1280x720 to 800x480.

  • PostProc: this module is used for post-processing. It needs to be enabled when AVFilter is used, because some filters rely on its basic functions.

Registration and initialization functions

  • av_register_all(): registers all components; deprecated since FFmpeg 4.0.

  • avdevice_register_all(): registers devices such as V4L2.

  • avformat_network_init(): initializes the network libraries and the libraries for network encryption protocols (such as OpenSSL).

Container format functions

  • avformat_alloc_context(): allocates memory for an AVFormatContext structure and performs basic initialization.
  • avformat_free_context(): frees the structure itself and everything in it.
  • avformat_close_input(): closes the demuxer. After calling it, you no longer need to call avformat_free_context().
  • avformat_open_input(): opens the input media file.
  • avformat_find_stream_info(): reads the stream information of the audio/video file.
  • av_read_frame(): reads audio and video packets.
  • avformat_seek_file(): seeks within the file.
  • av_seek_frame(): seeks within the file.

Decoder functions

  • avcodec_alloc_context3(): allocates the decoder context.
  • avcodec_find_decoder(): finds a decoder by codec ID.
  • avcodec_find_decoder_by_name(): finds a decoder by name.
  • avcodec_open2(): opens the codec.
  • avcodec_decode_video2(): decodes one frame of video data (deprecated in favor of avcodec_send_packet()/avcodec_receive_frame()).
  • avcodec_decode_audio4(): decodes one frame of audio data (deprecated in favor of avcodec_send_packet()/avcodec_receive_frame()).
  • avcodec_send_packet(): sends an encoded packet to the decoder.
  • avcodec_receive_frame(): receives a decoded frame from the decoder.
  • avcodec_free_context(): frees the decoder context; this includes the work of avcodec_close().
  • avcodec_close(): closes the decoder.

FFmpeg 3.x component registration

In FFmpeg 3.x, we must first call av_register_all() to register the decoders, encoders, and other components into their global lists so that they can be looked up later.

FFmpeg 4.x component registration

FFmpeg 4.0.2 component registration

4. Brief introduction to common FFmpeg structures

  • AVFormatContext: the container format context structure. It is the top-level structure and holds information about the container format of the media file.
  • AVInputFormat (demuxer): each container format (such as FLV, MKV, MP4, AVI) corresponds to one instance of this structure.
  • AVOutputFormat (muxer)
  • AVStream: each video (audio) stream in a media file corresponds to one instance of this structure.
  • AVCodecContext: the codec context structure; it stores video (audio) codec information.
  • AVCodec: each video (audio) codec (such as the H.264 decoder) corresponds to one instance of this structure.
  • AVPacket: stores one frame of compressed (encoded) data.
  • AVFrame: stores one frame of decoded pixel (or audio sample) data.

Relationships between FFmpeg data structures

Relationship between AVFormatContext and AVInputFormat

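  • AVFormatContext: used directly in API calls
  • AVInputFormat: mainly called internally by FFmpeg

AVFormatContext holds a pointer to the AVInputFormat in use:

struct AVInputFormat *iformat;

AVInputFormat provides the format-specific callbacks:

int (*read_header)(struct AVFormatContext *);
int (*read_packet)(struct AVFormatContext *, AVPacket *pkt);

A specific AVInputFormat can also be passed in when opening the input; passing NULL lets FFmpeg detect the format automatically:

int avformat_open_input(AVFormatContext **ps, const char *filename, AVInputFormat *fmt, AVDictionary **options);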

Relationship between AVCodecContext and AVCodec

AVCodecContext holds a pointer to the AVCodec in use:

struct AVCodec *codec;

AVCodec, one per video (audio) codec, provides the codec-specific callbacks:

int (*decode)(AVCodecContext *, void *outdata, int *outdata_size, AVPacket *avpkt);
int (*encode2)(AVCodecContext *avctx, AVPacket *avpkt, const AVFrame *frame, int *got_packet_ptr);

Distinguishing between streams

AVMEDIA_TYPE_VIDEO selects the video stream:

video_index = av_find_best_stream(ic, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);

AVMEDIA_TYPE_AUDIO selects the audio stream:

audio_index = av_find_best_stream(ic, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);

AVPacket also carries a stream_index field that identifies the stream the packet belongs to.

Data structure analysis

AVFormatContext

  • iformat: the AVInputFormat of the input media, for example a pointer to AVInputFormat ff_flv_demuxer
  • nb_streams: number of AVStreams in the input media
  • streams: array of AVStream pointers for the input media
  • duration: duration of the input media, in microseconds. For details, see the av_dump_format() function.
  • bit_rate: bit rate of the input media

AVInputFormat

  • name: name of the container format
  • extensions: file extensions of the container format
  • id: ID of the container format
  • Interface functions for handling the container format, such as read_packet()

AVStream

  • index: index of the video/audio stream
  • time_base: time base of the stream; PTS * time_base = real time in seconds
  • avg_frame_rate: average frame rate of the stream
  • duration: duration of the video/audio stream, in time_base units
  • codecpar: codec parameters (an AVCodecParameters structure)
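
To make the PTS * time_base relation concrete, here is a small self-contained sketch. The `Rational` type is a local stand-in for FFmpeg's AVRational, and `pts_to_seconds` is a hypothetical helper, so the block compiles without FFmpeg; in real code the same result would normally be obtained with `pts * av_q2d(stream->time_base)`.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for FFmpeg's AVRational (numerator / denominator). */
typedef struct Rational {
    int num;
    int den;
} Rational;

/* Hypothetical helper: converts a timestamp counted in time_base units to seconds. */
double pts_to_seconds(int64_t pts, Rational time_base)
{
    assert(time_base.den != 0);
    return (double)pts * time_base.num / time_base.den;
}
```

For example, with a time base of 1/90000, a PTS of 45000 corresponds to 0.5 seconds. Real code should also check for AV_NOPTS_VALUE before converting, since not every packet or frame carries a valid timestamp.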

AVCodecParameters

  • codec_type: media type, such as AVMEDIA_TYPE_VIDEO or AVMEDIA_TYPE_AUDIO
  • codec_id: codec ID, such as AV_CODEC_ID_H264 or AV_CODEC_ID_AAC

AVCodecContext

  • codec: the AVCodec in use, for example a pointer to AVCodec ff_aac_latm_decoder
  • width, height: width and height of the image (video only)
  • pix_fmt: pixel format (video only)
  • sample_rate: sampling rate (audio only)
  • channels: number of channels (audio only)
  • sample_fmt: sample format (audio only)

AVCodec

  • name: codec name
  • type: codec type
  • id: codec ID
  • Codec interface functions, such as int (*decode)()

AVPacket

  • pts: presentation timestamp
  • dts: decoding timestamp
  • data: compressed (encoded) data
  • size: size of the compressed data
  • pos: byte offset of the data
  • stream_index: index of the AVStream the packet belongs to

AVFrame

  • data: decoded image pixel data (or audio sample data)
  • linesize: for video, the size in bytes of one row of the image, which may include padding; for audio, the size in bytes of each audio plane
  • width, height: width and height of the image (video only)
  • key_frame: whether the frame is a key frame (video only)
  • pict_type: frame type (video only), for example I, P, or B
  • sample_rate: audio sampling rate (audio only)
  • nb_samples: number of audio samples per channel (audio only)
  • pts: presentation timestamp
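
Because linesize may be larger than the visible row width, code that walks a video plane has to step by linesize rather than by width. The sketch below is a self-contained illustration with no FFmpeg dependency. `copy_plane_packed` is a hypothetical helper, and it assumes a positive linesize (FFmpeg permits negative values for vertically flipped images, which this sketch does not handle).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies `height` rows of `row_bytes` visible bytes each
 * from a padded plane (rows start `linesize` bytes apart) into a tightly
 * packed destination buffer of row_bytes * height bytes. */
void copy_plane_packed(uint8_t *dst, const uint8_t *src,
                       int linesize, int row_bytes, int height)
{
    assert(linesize >= row_bytes);
    for (int y = 0; y < height; y++) {
        memcpy(dst + (size_t)y * row_bytes,   /* packed rows: no gap between them */
               src + (size_t)y * linesize,    /* source rows: skip the padding    */
               (size_t)row_bytes);
    }
}
```

For an AVFrame, `src` would be `frame->data[0]` and `linesize` would be `frame->linesize[0]`. FFmpeg's own `av_image_copy_plane()` in libavutil performs a similar row-by-row copy.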
