In chapter 1, we downloaded and compiled the FFmpeg library, but we never actually explained what FFmpeg is.

Introduction

FFmpeg is an open-source software suite for recording, converting, and streaming digital audio and video.

Components

Command-line tools:

  • ffmpeg: converts audio and video files between formats.
  • ffplay: a simple media player, built on the FFmpeg libraries (decoding) and SDL (display).
  • ffprobe: displays information about media files.

Libraries:

  • libavcodec: all the audio/video encoders and decoders.
  • libavdevice: input/output devices for capturing and rendering audio and video (cameras, sound cards, displays, and so on).
  • libavfilter: a generic audio and video filtering framework. Filters apply effects to audio and video data, such as desaturation, blurring, horizontal flipping, cropping, drawing boxes, and overlaying text.
  • libavformat: muxers and demuxers for multimedia container formats such as MP4, FLV, and TS, as well as streaming protocols such as RTMP, RTSP, and HLS.
  • libavresample: audio resampling, sample-format conversion, and mixing.
  • libavutil: utility functions, including random number generators, data structures, math routines, and core multimedia helpers.
  • libpostproc: video post-processing.
  • libswresample: audio resampling, sample-format conversion, and mixing.
  • libswscale: highly optimized image scaling and color-space/pixel-format conversion, such as RGB to YUV.

libavcodec, libavformat, and libavutil are the most commonly used libraries, providing the basic building blocks for audio and video development.

Note: while writing this article I checked many posts, and none explained the difference between libavresample and libswresample; even the official descriptions are essentially the same. In short, libavresample came from the Libav fork and has been deprecated in favor of libswresample; we'll come back to the details later.

Commonly used structures

Before introducing them, let's review the audio and video processing pipeline.

By function, it can be divided into the following stages:

  1. Protocol handling (HTTP, RTSP, RTMP, MMS)

    • AVIOContext: a structure that manages input and output data;
    • URLProtocol: describes a protocol used for audio/video data transmission; each transport protocol has a corresponding URLProtocol structure;
    • URLContext: stores the type and state of the protocol in use.
  2. Demuxing (FLV, AVI, RMVB, MP4)

    • AVFormatContext: stores information about the audio/video container format; it is the top-level structure that ties everything together;
    • AVInputFormat: describes the container format of the input; each container format corresponds to one AVInputFormat structure;
    • AVOutputFormat: describes the container format of the output; each container format corresponds to one AVOutputFormat structure.
  3. Decoding (H.264, MPEG-2, AAC, MP3)

    • AVStream: stores the data for one video or audio stream;
    • AVCodecContext: stores the codec parameters and decoding state for a stream;
    • AVCodec: represents one video/audio codec; each codec corresponds to one AVCodec structure.

    Each AVStream corresponds to an AVCodecContext, and each AVCodecContext corresponds to an AVCodec. (In older FFmpeg versions, AVStream held an AVCodecContext directly via its codec member; newer versions expose the stream's parameters as codecpar, from which you fill in your own AVCodecContext.)

  4. Storing the data

    • AVPacket: compressed data, after encoding and before decoding, e.g. an H.264 bitstream or AAC data;
    • AVFrame: raw data, before encoding or after decoding, e.g. YUV video or PCM audio.

    For video, an AVPacket usually contains exactly one frame; for audio, it may contain several frames.

The corresponding relationship is as follows: an AVFormatContext contains one or more AVStreams; each AVStream maps to an AVCodecContext, which maps to an AVCodec; demuxing produces AVPackets, and decoding turns them into AVFrames.

Official technical documentation: ffmpeg.org/doxygen/tru…

Lei Xiaohua's blog: blog.csdn.net/leixiaohua1… Other blogs: www.cnblogs.com/renhui/p/69… www.cnblogs.com/linuxAndMcu…