1. Introduction

FFmpeg is a powerful audio and video processing library, but it is usually used through its command-line tools. This article covers the FFmpeg C APIs instead, especially its powerful filter library.

1.1 What You Can Learn

  • Android integration with FFmpeg
  • Decode the audio using the AVCodec library
  • Use avfilter to change speed, adjust volume and mix audio
  • C/C++ multi-threaded programming, producer/consumer implementation
  • Audio playback through OpenSL ES in NDK
  • Control audio playback in NDK

1.2 What is implemented

The project uses four separate tracks of the song "Five Hundred Miles" (guitar, ukulele, drums, etc.) as its material. It implements multi-track real-time playback, per-track volume adjustment, variable-speed playback, seeking and other functions.

1.3 Project Address

Github.com/iamyours/FF…

2. FFmpeg dynamic library compilation

2.1 Downloading NDK and FFmpeg

The NDK version that Android Studio downloads by default can cause some compatibility problems, so we use NDK r15c (win64 / linux / mac64) here. Download the FFmpeg source code from the official website; I used version 3.2.12.

2.2 Decompressing Files

First unzip the NDK and FFmpeg:

tar -zxf ffmpeg-3.2.12.tar.gz
unzip android-ndk-r15c-darwin-x86_64.zip -d android-ndk-r15c

2.3 Modify FFmpeg configuration for Android

Go to the FFmpeg directory and modify the configure file

SLIBNAME_WITH_MAJOR='$(SLIBNAME).$(LIBMAJOR)'
LIB_INSTALL_EXTRA_CMD='$$(RANLIB) "$(LIBDIR)/$(LIBNAME)"'
SLIB_INSTALL_NAME='$(SLIBNAME_WITH_VERSION)'
SLIB_INSTALL_LINKS='$(SLIBNAME_WITH_MAJOR) $(SLIBNAME)'

Replace with

SLIBNAME_WITH_MAJOR='$(SLIBPREF)$(FULLNAME)-$(LIBMAJOR)$(SLIBSUF)'
LIB_INSTALL_EXTRA_CMD='$$(RANLIB) "$(LIBDIR)/$(LIBNAME)"'
SLIB_INSTALL_NAME='$(SLIBNAME_WITH_MAJOR)'
SLIB_INSTALL_LINKS='$(SLIBNAME)'

2.4 Write FFmpeg script to generate dynamic SO library

Create the build_android.sh script

#!/bin/sh
NDK=/Users/xxx/Desktop/soft/android-ndk-r15c
SYSROOT=$NDK/platforms/android-21/arch-arm
TOOLCHAIN=$NDK/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64
function build_one
{
./configure \
--prefix=$PREFIX \
--enable-shared \
--disable-static \
--disable-doc \
--disable-ffmpeg \
--disable-ffplay \
--disable-ffprobe \
--disable-ffserver \
--disable-avdevice \
--disable-doc \
--disable-symver \
--cross-prefix=$TOOLCHAIN/bin/arm-linux-androideabi- \
--target-os=linux \
--arch=arm \
--enable-cross-compile \
--sysroot=$SYSROOT \
--extra-cflags="-Os -fpic $ADDI_CFLAGS" \
--extra-ldflags="$ADDI_LDFLAGS" \
$ADDITIONAL_CONFIGURE_FLAG
make clean
make
make install
}
CPU=arm
PREFIX=$(pwd)/android/$CPU
ADDI_CFLAGS="-marm"
build_one

Add the execute permission and run the sh script

chmod +x build_android.sh
./build_android.sh

The entire compilation took about 10 minutes (on an i5 MacBook Pro). When it completes, you can see the generated .so files and header files in the android directory.

3. Add FFmpeg to the Android project

3.1 Create an Android project and add C++ support

Open Android Studio, create a new project FFmpegAudioPlayer, and add C++ support

3.2 Configuring FFmpeg Dynamic Library

Create a jniLibs folder under src/main, create an armeabi folder inside jniLibs, and copy the generated libraries (libavcodec-57.so, libavfilter-6.so, libavformat-57.so, libavutil-55.so, libswresample-2.so, libswscale-4.so) into it. Copy the entire android/arm/include directory into jniLibs. The final layout:

jniLibs/
├── armeabi/
│   ├── libavcodec-57.so
│   ├── libavfilter-6.so
│   ├── libavformat-57.so
│   ├── libavutil-55.so
│   ├── libswresample-2.so
│   └── libswscale-4.so
└── include/

Then restrict the ABI to armeabi in the app's build.gradle:

android {
    ...
    defaultConfig {
        ...
        externalNativeBuild {
            ndk {
                abiFilters "armeabi"
            }
        }
    }
    ...
}

Open the CMakeLists.txt file in the app directory and modify the configuration as follows

cmake_minimum_required(VERSION 3.4.1)
add_library( native-lib
             SHARED
             src/main/cpp/native-lib.cpp)
find_library( log-lib
              log )
find_library( android-lib
              android )
set(distribution_DIR ${CMAKE_SOURCE_DIR}/src/main/jniLibs/${ANDROID_ABI})


add_library( avutil-55
             SHARED
             IMPORTED )
set_target_properties( avutil-55
                       PROPERTIES IMPORTED_LOCATION
                       ${distribution_DIR}/libavutil-55.so)

...
# repeat add_library/set_target_properties in the same way for
# swresample-2, avcodec-57, avfilter-6, swscale-4 and avformat-57

set(CMAKE_VERBOSE_MAKEFILE on)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=gnu++11")
include_directories(src/main/cpp)
include_directories(src/main/jniLibs/include)

target_link_libraries(native-lib
                      avutil-55       # tool library
                      swresample-2    # Audio sampling data format conversion
                      avcodec-57      # codec
                      avfilter-6      # Filter special effects processing
                      swscale-4       # Video pixel data format conversion
                      avformat-57     # Encapsulate format processing
                      OpenSLES
                      ${log-lib}
                      ${android-lib})

After the configuration is complete, compile and run once. If the app installs on a phone and runs normally, the configuration is correct.
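To quickly verify the linkage, you can call any FFmpeg API from native-lib.cpp. A minimal sketch (the method name ffmpegConfig and its Kotlin external declaration are assumptions, not part of the project):

#include <jni.h>
extern "C" {
#include <libavcodec/avcodec.h>
}

// Hypothetical helper: returns FFmpeg's build configuration string,
// confirming the libraries are linked and loadable.
extern "C" JNIEXPORT jstring JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_MainActivity_ffmpegConfig(JNIEnv *env, jobject /* this */) {
    return env->NewStringUTF(avcodec_configuration());
}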

4. Decode mp3 to PCM

The first power of FFmpeg is its codec capability. It can decode virtually any audio format (MP3, WAV, AAC, OGG, etc.) and video format (MP4, AVI, RM, RMVB, MOV, etc.) in common use. The decoder turns audio and video into AVFrames, each containing the PCM data of audio or the YUV data of video. With encoders, FFmpeg can encode frames into audio and video files of different formats, so we can do format conversion very simply without understanding the underlying protocols of each format.

4.1 Decoding Process

To decode an MP3 file, we read the audio information through FFmpeg, get the corresponding decoder, then loop through each frame of audio data and decode it with the decoder. The general decoding flow:

open input -> find stream info -> find the audio stream index -> open the decoder -> read packets in a loop -> decode each packet into an AVFrame -> write out the PCM data

4.2 Complete Code

Load the so libraries in MainActivity.kt

class MainActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
    }

    fun decodeAudio(v: View) {
        val src = "${Environment.getExternalStorageDirectory()}/test1.mp3"
        val out = "${Environment.getExternalStorageDirectory()}/out.pcm"
        decodeAudio(src, out)
    }

    external fun decodeAudio(src: String, out: String)
    companion object {
        init {
            System.loadLibrary("avutil-55")
            System.loadLibrary("swresample-2")
            System.loadLibrary("avcodec-57")
            System.loadLibrary("avfilter-6")
            System.loadLibrary("swscale-4")
            System.loadLibrary("avformat-57")
            System.loadLibrary("native-lib")
        }
    }
}

Write audio decoding code in native-lib.cpp

#include <jni.h>
#include <android/log.h>
#include <string>

extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
#include <libswresample/swresample.h>
}
#define LOGI(FORMAT, ...) __android_log_print(ANDROID_LOG_INFO,"FFmpegAudioPlayer",FORMAT,##__VA_ARGS__);
#define LOGE(FORMAT, ...) __android_log_print(ANDROID_LOG_ERROR,"FFmpegAudioPlayer",FORMAT,##__VA_ARGS__);
extern "C" JNIEXPORT void
JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_MainActivity_decodeAudio(
        JNIEnv *env,
        jobject /* this */, jstring _src, jstring _out) {
    const char *src = env->GetStringUTFChars(_src, 0);
    const char *out = env->GetStringUTFChars(_out, 0);

    av_register_all();// Register all container decoders
    AVFormatContext *fmt_ctx = avformat_alloc_context();

    if (avformat_open_input(&fmt_ctx, src, NULL, NULL) < 0) {// Open the file
        LOGE("open file error");
        return;
    }
    if (avformat_find_stream_info(fmt_ctx, NULL) < 0) {// Read file information in audio format
        LOGE("find stream info error");
        return;
    }
    // Get the audio index
    int audio_stream_index = -1;
    for (int i = 0; i < fmt_ctx->nb_streams; i++) {
        if (fmt_ctx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
            audio_stream_index = i;
            LOGI("find audio stream index");
            break;
        }
    }
    // Get the decoder
    AVCodecContext *codec_ctx = avcodec_alloc_context3(NULL);
    avcodec_parameters_to_context(codec_ctx, fmt_ctx->streams[audio_stream_index]->codecpar);
    AVCodec *codec = avcodec_find_decoder(codec_ctx->codec_id);
    // Open the decoder
    if (avcodec_open2(codec_ctx, codec, NULL) < 0) {
        LOGE("could not open codec");
        return;
    }
    // Allocate AVPacket and AVFrame memory to receive audio data and decode data
    AVPacket *packet = av_packet_alloc();
    AVFrame *frame = av_frame_alloc();
    int got_frame;// Receive the result of decoding
    int index = 0;
    // PCM output file
    FILE *out_file = fopen(out, "wb");
    while (av_read_frame(fmt_ctx, packet) == 0) {// Read audio data into packet
        if (packet->stream_index == audio_stream_index) {// Fetch audio index packet
            if (avcodec_decode_audio4(codec_ctx, frame, &got_frame, packet) <
                0) {// Decode the packet into an AVFrame
                LOGE("decode error:%d", index);
                break;
            }
            if (got_frame > 0) {
                LOGI("decode frame:%d", index++);
                fwrite(frame->data[0], 1, static_cast<size_t>(frame->linesize[0]),
                       out_file); // write single-channel PCM data to the file

            }
        }
    }
    LOGI("decode finish...");
    // Release resources
    av_packet_unref(packet);
    av_frame_free(&frame);
    avcodec_close(codec_ctx);
    avformat_close_input(&fmt_ctx);
    fclose(out_file);
}

Remember to add the file (storage) permission. Put the test audio test1.mp3 on the phone's SD card and tap the decode button; when it finishes you can see the PCM file. You can open it with Audition (on a Mac it can be installed in Windows via Parallels Desktop; coherence mode is not very usable). Select 48000 Hz, 1 channel (only one channel was written), open it, and you can view and play the PCM file in Audition.

5. Single-input AVFilters

Another powerful feature of FFmpeg is its large set of filters, which can apply different effects to audio and video. Video can be clipped, scaled, rotated, merged, watermarked and so on; audio can be denoised, echoed, delayed, mixed, speed-changed and so on. The output of one filter can be the input of another, so by combining filters we can build the effects we want. The audio filter API usage is covered in two parts; this part covers single-input filters such as volume (volume adjustment) and atempo (speed change).

5.1 Single-input Audio Filtering Process

After decoding the audio, the avfilter API can be used to apply effects to the decoded AVFrames, such as volume adjustment and speed change. Multiple audio inputs can also be mixed (see 6.1). The single-input filtering flow:

AVFrame -> abuffer -> other filters (volume, ...) -> aformat -> abuffersink -> filtered AVFrame

There are three general-purpose filters here: abuffer, aformat and abuffersink. abuffer receives input frames and buffers the data to be processed; abuffersink hands out the output frames; aformat constrains the final output format (sample rate, channel layout, sample size). All three are indispensable. In between, any number of filters can be chained, such as volume and atempo.

5.2 Filter Initialization

There are three important structures we need to know: AVFilterGraph, AVFilterContext and AVFilter.
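Roughly, they relate like this (a minimal sketch using the same calls as the code below; the variable names are illustrative):

// AVFilter: a static filter definition registered with FFmpeg, looked up by name
AVFilter *abuffer = avfilter_get_by_name("abuffer");
// AVFilterGraph: owns a set of connected filter instances
AVFilterGraph *graph = avfilter_graph_alloc();
// AVFilterContext: one configured instance of a filter inside a graph
AVFilterContext *abuffer_ctx = avfilter_graph_alloc_filter(graph, abuffer, "src");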

5.3 Filter initialization code

The value parameter is the volume adjustment factor. The specific code is as follows

int init_volume_filter(AVFilterGraph **pGraph, AVFilterContext **src, AVFilterContext **out,
                       char *value) {

    // Initialize AVFilterGraph
    AVFilterGraph *graph = avfilter_graph_alloc();
    // Get the abuffer used to receive input
    AVFilter *abuffer = avfilter_get_by_name("abuffer");
    AVFilterContext *abuffer_ctx = avfilter_graph_alloc_filter(graph, abuffer, "src");
    // Set parameters, here need to match the original audio sampling rate, data format (bits)
    if (avfilter_init_str(abuffer_ctx, "sample_rate=48000:sample_fmt=s16p:channel_layout=stereo") <
        0) {
        LOGE("error init abuffer filter");
        return -1;
    }
    // Initialize the volume filter
    AVFilter *volume = avfilter_get_by_name("volume");
    AVFilterContext *volume_ctx = avfilter_graph_alloc_filter(graph, volume, "volume");
    // the av_dict_set parameter is used here
    AVDictionary *args = NULL;
    av_dict_set(&args, "volume", value, 0);// External parameters are passed here, which can be dynamically modified
    if (avfilter_init_dict(volume_ctx, &args) < 0) {
        LOGE("error init volume filter");
        return -1;
    }

    AVFilter *aformat = avfilter_get_by_name("aformat");
    AVFilterContext *aformat_ctx = avfilter_graph_alloc_filter(graph, aformat, "aformat");
    if (avfilter_init_str(aformat_ctx,
                          "sample_rates=48000:sample_fmts=s16p:channel_layouts=stereo") < 0) {
        LOGE("error init aformat filter");
        return -1;
    }
    // Initialize sink for output
    AVFilter *sink = avfilter_get_by_name("abuffersink");
    AVFilterContext *sink_ctx = avfilter_graph_alloc_filter(graph, sink, "sink");
    if (avfilter_init_str(sink_ctx, NULL) < 0) {// No arguments required
        LOGE("error init sink filter");
        return -1;
    }
    // Link each filter context
    if (avfilter_link(abuffer_ctx, 0, volume_ctx, 0) != 0) {
        LOGE("error link to volume filter");
        return -1;
    }
    if (avfilter_link(volume_ctx, 0, aformat_ctx, 0) != 0) {
        LOGE("error link to aformat filter");
        return -1;
    }
    if (avfilter_link(aformat_ctx, 0, sink_ctx, 0) != 0) {
        LOGE("error link to sink filter");
        return -1;
    }
    if (avfilter_graph_config(graph, NULL) < 0) {
        LOGE("error config filter graph");
        return -1;
    }
    *pGraph = graph;
    *src = abuffer_ctx;
    *out = sink_ctx;
    LOGI("init filter success...");
    return 0;
}

5.4 Use filters to simulate real-time volume adjustment

Once the filter graph is initialized, we can use it to process the decoded audio. Add the decoded AVFrame to the input filter context via av_buffersrc_add_frame(abuffer_ctx, frame), and fetch the processed frame via av_buffersink_get_frame(sink_ctx, frame). Here the filter is re-initialized every 1000 audio frames to simulate real-time volume adjustment, as in the following code

    AVFilterGraph *graph;
    AVFilterContext *in_ctx;
    AVFilterContext *out_ctx;
    // Register all filters
    avfilter_register_all();
    init_volume_filter(&graph, &in_ctx, &out_ctx, "0.5");
    // initialization
    while (av_read_frame(fmt_ctx, packet) == 0) {// Read audio data into packet
        if (packet->stream_index == audio_stream_index) {// Fetch audio index packet
            // ... decode the audio as in section 4 ...
            if (got_frame > 0) {
                LOGI("decode frame:%d", index++);
               if (index == 1000) {// Simulate dynamic volume modification
                    init_volume_filter(&graph, &in_ctx, &out_ctx, "0.01");
                }
                if (index == 2000) {
                    init_volume_filter(&graph, &in_ctx, &out_ctx, "1.0");
                }
                if (index == 3000) {
                    init_volume_filter(&graph, &in_ctx, &out_ctx, "0.01");
                }
                if (index == 4000) {
                    init_volume_filter(&graph, &in_ctx, &out_ctx, "1.0");
                }
                if (av_buffersrc_add_frame(in_ctx, frame) < 0) {// Put frame into the input filter context
                    LOGE("error add frame");
                    break;
                }
                while (av_buffersink_get_frame(out_ctx, frame) >= 0) {// Get the frame from the output filter context
                    fwrite(frame->data[0], 1, static_cast<size_t>(frame->linesize[0]),
                           out_file); // write single-channel PCM data to the file
                }
            }
        }
    }

Finally, a waveform comparison of the decoded PCM and the original MP3.

5.5 Resampling with swr_convert

When playing this audio you can hear some noise; we need swr_convert to resample it and obtain the complete PCM data.

    // Initialize SwrContext
    SwrContext *swr_ctx = swr_alloc();
    enum AVSampleFormat in_sample = codec_ctx->sample_fmt;
    enum AVSampleFormat out_sample = AV_SAMPLE_FMT_S16;
    int inSampleRate = codec_ctx->sample_rate;
    int outSampleRate = inSampleRate;
    uint64_t in_ch_layout = codec_ctx->channel_layout;
    uint64_t outChannelLayout = AV_CH_LAYOUT_STEREO;
    swr_alloc_set_opts(swr_ctx, outChannelLayout, out_sample, outSampleRate, in_ch_layout, in_sample,
                       inSampleRate, 0, NULL);
    swr_init(swr_ctx);
    int out_ch_layout_nb = av_get_channel_layout_nb_channels(outChannelLayout);// number of output channels
    uint8_t *out_buffer = (uint8_t *) av_malloc(MAX_AUDIO_SIZE);// buffer for resampled data

Before writing the PCM data, resample it with swr_convert

 while (av_buffersink_get_frame(out_ctx, frame) >= 0) {// Get the frame from the output filter context
//  fwrite(frame->data[0], 1, static_cast<size_t>(frame->linesize[0]), out_file);
    swr_convert(swr_ctx, &out_buffer, MAX_AUDIO_SIZE,
                (const uint8_t **) frame->data, frame->nb_samples);
    int out_size = av_samples_get_buffer_size(NULL, out_ch_layout_nb, frame->nb_samples,
                                              out_sample, 0);
    fwrite(out_buffer, 1, out_size, out_file);
}

This time we write the full two channels of data, and there is no noise.
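The noise came from the sample layout: in a planar format such as AV_SAMPLE_FMT_S16P each channel lives in its own data plane, so writing only frame->data[0] drops half the samples, while swr_convert produces packed data with the channels interleaved. An illustrative sketch of what the interleaving amounts to for the stereo S16P case (not project code):

#include <cstdint>

// Planar S16P: frame->data[0] = L L L L ..., frame->data[1] = R R R R ...
// Packed S16 (the swr_convert output here): L R L R ...
static void interleave_s16p(const int16_t *left, const int16_t *right,
                            int16_t *packed, int nb_samples) {
    for (int i = 0; i < nb_samples; i++) {
        packed[2 * i]     = left[i];   // left plane
        packed[2 * i + 1] = right[i];  // right plane
    }
}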

5.6 Using the atempo filter for speed change without pitch change

Change the filter name in init_volume_filter from volume to atempo, and the parameter key to tempo

AVFilter *volume = avfilter_get_by_name("atempo");
    AVFilterContext *volume_ctx = avfilter_graph_alloc_filter(graph, volume, "atempo"); // AVDictionary *args = NULL; av_dict_set(&args,"tempo", value, 0); // Adjust the volume to half the original volumeif (avfilter_init_dict(volume_ctx, &args) < 0) {
        LOGE("error init volume filter");
        return- 1; }Copy the code

The simulated dynamic changes during decoding become the following

if (index == 1000) {// Simulate dynamic tempo modification
    init_volume_filter(&graph, &in_ctx, &out_ctx, "1.0");
}
if (index == 2000) {
    init_volume_filter(&graph, &in_ctx, &out_ctx, "0.8");
}
if (index == 3000) {
    init_volume_filter(&graph, &in_ctx, &out_ctx, "1.5");
}
if (index == 4000) {
    init_volume_filter(&graph, &in_ctx, &out_ctx, "2.0");
}

When it succeeds you get audio with varying speed. Open it in Audition (48000 Hz, 2 channels) and you can hear it play at 0.5, 1.0, 0.8, 1.5 and 2.0 speed in turn, while the pitch stays the same instead of rising or falling with the speed.

6. Multi-input AVFilters

Another scenario where FFmpeg filters are used is processing multiple inputs, such as adding watermarks or subtitles to video, or merging audio and video; these scenarios require two or more inputs. This section covers amix, which can mix multiple audio streams.

6.1 Multi-input Filter Processing Flow

input AVFrame1 -> abuffer1 \
                             amix -> aformat -> abuffersink -> output AVFrame
input AVFrame2 -> abuffer2 /

The process is much the same as for a single-input filter, except that it receives multiple inputs, so multiple abuffer filter contexts are required as inputs.

6.2 amix Filter Initialization

// initialize the amix filter
int init_amix_filter(AVFilterGraph **pGraph, AVFilterContext **srcs, AVFilterContext **pOut,
                     jsize len) {
    AVFilterGraph *graph = avfilter_graph_alloc();
    for (int i = 0; i < len; i++) {
        AVFilter *filter = avfilter_get_by_name("abuffer");
        char name[50];
        snprintf(name, sizeof(name), "src%d", i);
        AVFilterContext *abuffer_ctx = avfilter_graph_alloc_filter(graph, filter, name);
        if (avfilter_init_str(abuffer_ctx,
                              "sample_rate=48000:sample_fmt=s16p:channel_layout=stereo") < 0) {
            LOGE("error init abuffer filter");
            return -1;
        }
        srcs[i] = abuffer_ctx;
    }
    AVFilter *amix = avfilter_get_by_name("amix");
    AVFilterContext *amix_ctx = avfilter_graph_alloc_filter(graph, amix, "amix");
    char args[128];
    snprintf(args, sizeof(args), "inputs=%d:duration=first:dropout_transition=3", len);
    if (avfilter_init_str(amix_ctx, args) < 0) {
        LOGE("error init amix filter");
        return -1;
    }
    AVFilter *aformat = avfilter_get_by_name("aformat");
    AVFilterContext *aformat_ctx = avfilter_graph_alloc_filter(graph, aformat, "aformat");
    if (avfilter_init_str(aformat_ctx,
                          "sample_rates=48000:sample_fmts=s16p:channel_layouts=stereo") < 0) {
        LOGE("error init aformat filter");
        return -1;
    }
    AVFilter *sink = avfilter_get_by_name("abuffersink");
    AVFilterContext *sink_ctx = avfilter_graph_alloc_filter(graph, sink, "sink");
    avfilter_init_str(sink_ctx, NULL);
    for (int i = 0; i < len; i++) {
        if (avfilter_link(srcs[i], 0, amix_ctx, i) < 0) {
            LOGE("error link to amix");
            return -1;
        }
    }
    if (avfilter_link(amix_ctx, 0, aformat_ctx, 0) < 0) {
        LOGE("error link to aformat");
        return -1;
    }
    if (avfilter_link(aformat_ctx, 0, sink_ctx, 0) < 0) {
        LOGE("error link to sink");
        return -1;
    }
    if (avfilter_graph_config(graph, NULL) < 0) {
        LOGE("error config graph");
        return -1;
    }
    *pGraph = graph;
    *pOut = sink_ctx;
    return 0;
}

Here the input AVFilterContexts are held in an array, and each input is linked to the amix filter in a loop, so any number of inputs can be received.
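A sketch of how it might be called (variable names follow the next section; the allocation shown is an assumption):

    AVFilterGraph *graph = NULL;
    AVFilterContext *sink = NULL;
    AVFilterContext **srcs = (AVFilterContext **) malloc(len * sizeof(AVFilterContext *));
    if (init_amix_filter(&graph, srcs, &sink, len) < 0) {
        LOGE("init amix filter failed");
    }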

6.3 Using amix for multi-track mixing

To be able to pass in multiple audio data, we need to decode multiple audio files at the same time, so in the Java layer, we pass in an array of strings.

external fun mixAudio(arr: Array<String>, out: String)

val path = "${Environment.getExternalStorageDirectory()}/test"
val paths = arrayOf(
    "$path/a.mp3", "$path/b.mp3", "$path/c.mp3", "$path/d.mp3"
)
mixAudio(paths, "$path/mix.pcm")

Decode each file using multiple decoders at the JNI layer

extern "C" JNIEXPORT void
JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_MainActivity_mixAudio(
        JNIEnv *env,
        jobject /* this */, jobjectArray _srcs, jstring _out) {
    // Convert an array of strings passed in by Java to an array of C strings
    jsize len = env->GetArrayLength(_srcs);
    const char *out_path = env->GetStringUTFChars(_out, 0);
    char **pathArr = (char **) malloc(len * sizeof(char *));
    int i = 0;
    for (i = 0; i < len; i++) {
        jstring str = static_cast<jstring>(env->GetObjectArrayElement(_srcs, i));
        pathArr[i] = const_cast<char *>(env->GetStringUTFChars(str, 0));
    }
    // Initialize the decoder array
    av_register_all();
    AVFormatContext **fmt_ctx_arr = (AVFormatContext **) malloc(len * sizeof(AVFormatContext *));
    AVCodecContext **codec_ctx_arr = (AVCodecContext **) malloc(len * sizeof(AVCodecContext *));
    int stream_index_arr[len];
    for (int n = 0; n < len; n++) {
        AVFormatContext *fmt_ctx = avformat_alloc_context();
        fmt_ctx_arr[n] = fmt_ctx;
        // ... open each file in turn and find its audio stream index ...
        AVCodecContext *codec_ctx = avcodec_alloc_context3(NULL);
        codec_ctx_arr[n] = codec_ctx;
        // ... find and open each decoder ...
    }
    // Initialize SwrContext
    SwrContext *swr_ctx = swr_alloc();
    // ... set the swr_ctx parameters ...
    swr_init(swr_ctx);
    // Initialize the amix filter
    init_amix_filter(&graph, srcs, &sink, len);
    // Start decoding
    FILE *out_file = fopen(out_path, "wb");
    AVFrame *frame = av_frame_alloc();
    AVPacket *packet = av_packet_alloc();
    int ret = 0, got_frame;
    int index = 0;
    while (1) {
        for (int i = 0; i < len; i++) {
            ret = av_read_frame(fmt_ctx_arr[i], packet);
            if (ret < 0)break;
            if (packet->stream_index == stream_index_arr[i]) {
                ret = avcodec_decode_audio4(codec_ctx_arr[i], frame, &got_frame, packet);// Decode the audio
                if (ret < 0)break;
                if (got_frame > 0) {
                    ret = av_buffersrc_add_frame(srcs[i], frame);// Add the decoded AVFrame to the amix input
                    if (ret < 0) {
                        break;
                    }
                }
            }
        }
        while (av_buffersink_get_frame(sink, frame) >= 0) {// Get the processed AVFrame from the sink output
            swr_convert(swr_ctx, &out_buffer, MAX_AUDIO_SIZE, (const uint8_t **) frame->data,
                        frame->nb_samples);
            int out_size = av_samples_get_buffer_size(NULL, out_ch_layout_nb, frame->nb_samples,
                                                      out_sample_fmt, 0);
            fwrite(out_buffer, 1, out_size, out_file);
        }
        if (ret < 0) {
            break;
        }
        LOGI("decode frame :%d", index);
        index++;
    }
    LOGI("finish");
}

Open the output file mix.pcm in Audition and you can hear the four files mixed together. The source audio files are in the assets directory, so you can compare the effect.

7. Play audio using OpenSL ES

To play PCM audio on Android we use the OpenSL ES library. Add OpenSLES to target_link_libraries in CMakeLists.txt (already done in section 3.2) and include the headers.
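The OpenSL ES headers that ship with the NDK are:

#include <SLES/OpenSLES.h>
#include <SLES/OpenSLES_Android.h>  // Android-specific types such as SLDataLocator_AndroidSimpleBufferQueue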

7.1 OpenSLES Player Process

7.1.1. Create and realize the engine object
SLObjectItf engineObject;
slCreateEngine(&engineObject, 0, NULL, 0, NULL, NULL);
 (*engineObject)->Realize(engineObject, SL_BOOLEAN_FALSE);
7.1.2. Obtaining the engine interface
SLEngineItf engineItf;
(*engineObject)->GetInterface(engineObject, SL_IID_ENGINE, &engineItf);
7.1.3. Create and realize the output mix object
SLObjectItf mixObject;
(*engineItf)->CreateOutputMix(engineItf, &mixObject, 0, 0, 0);
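As in the full implementation in section 8.3.4, the output mix object also has to be realized before it is used:

(*mixObject)->Realize(mixObject, SL_BOOLEAN_FALSE);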
7.1.4. Set player parameters and create and initialize the player object
SLDataLocator_AndroidSimpleBufferQueue android_queue = {SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, 2};
// PCM format
SLDataFormat_PCM pcm = {SL_DATAFORMAT_PCM,
                        2,                           // two channels
                        SL_SAMPLINGRATE_48,          // 48000 sample rate
                        SL_PCMSAMPLEFORMAT_FIXED_16,
                        SL_PCMSAMPLEFORMAT_FIXED_16,
                        SL_SPEAKER_FRONT_LEFT | SL_SPEAKER_FRONT_RIGHT, // channel mask
                        SL_BYTEORDER_LITTLEENDIAN};
SLDataSource slDataSource = {&android_queue, &pcm};
// output mix
SLDataLocator_OutputMix outputMix = {SL_DATALOCATOR_OUTPUTMIX, mixObject};
SLDataSink audioSnk = {&outputMix, NULL};
const SLInterfaceID ids[3] = {SL_IID_BUFFERQUEUE, SL_IID_EFFECTSEND, SL_IID_VOLUME};
const SLboolean req[3] = {SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE};
SLObjectItf playerObject;
// create and realize the player
(*engineItf)->CreateAudioPlayer(engineItf, &playerObject, &slDataSource, &audioSnk, 1, ids, req);
(*playerObject)->Realize(playerObject, SL_BOOLEAN_FALSE);
7.1.5. Obtain the relevant interface from the player object
// Get the playback interface
SLPlayItf playItf;
(*playerObject)->GetInterface(playerObject, SL_IID_PLAY, &playItf);
// Get the buffer interface
SLBufferQueueItf bufferQueueItf;
(*playerObject)->GetInterface(playerObject, SL_IID_BUFFERQUEUE, &bufferQueueItf);
7.1.6. Register callback buffer, set playback state, and invoke callback function
// Register buffer callback
(*bufferQueueItf)->RegisterCallback(bufferQueueItf, playCallback, NULL);
// Set the playback state
(*playItf)->SetPlayState(playItf, SL_PLAYSTATE_PLAYING);
playCallback(bufferQueueItf, NULL);

The specific callback is as follows, and getPCM will be implemented later

void playCallback(SLAndroidSimpleBufferQueueItf bq, void *args) {
    // obtain PCM data
    uint8_t *data;
    int size = getPCM(&data);
    if (size > 0) {
        (*bq)->Enqueue(bq, data, size);
    }
}

7.2 Multi-threaded decoding and playback

To produce the PCM data, we decode the audio on a separate thread and use condition variables to implement a producer/consumer model: decoding is the producer, and the playback callback is the consumer. Decoded AVFrames are pushed into a vector queue; during playback, AVFrames are taken from the queue and converted to PCM data with swr_convert.

7.2.1. Initialize synchronization lock, condition variable, and start decoding thread

Declare global variables

static pthread_mutex_t mutex;
// Condition variable
static pthread_cond_t notfull; // The queue did not reach the maximum buffer capacity
static pthread_cond_t notempty;// The queue is not empty

Initialize the synchronization lock and condition variables and start the decoding thread (before creating the player)

// Initialize synchronization locks and condition variables
pthread_mutex_init(&mutex, NULL);
pthread_cond_init(&notfull, NULL);
pthread_cond_init(&notempty, NULL);

// Start the decode thread
pthread_t pid;
char *path = (char *) env->GetStringUTFChars(_path, 0);
pthread_create(&pid, NULL, decodeAudio, path);
7.2.2. Decode the audio and add AVFrame to the vector queue

Declare global variables

static std::vector<AVFrame *> queue;
static SwrContext *swr_ctx;
static int out_ch_layout_nb;
static enum AVSampleFormat out_sample_fmt;
#define QUEUE_SIZE 5
#define MAX_AUDIO_SIZE 48000*4

Decoding the audio is similar to [section 4], except that the decoded AVFrame is queued.

void *decodeAudio(void *args) {
    // ... open the file, init the context and decoder, allocate packet/frame memory ...
    while (av_read_frame(fmt_ctx, packet) == 0) {// Read audio data into packet
        if (packet->stream_index == audio_stream_index) {// Fetch audio index packet
            if (avcodec_decode_audio4(codec_ctx, frame, &got_frame, packet) <
                0) {// Decode the packet into an AVFrame
                LOGE("decode error:%d", index);
                break;
            }
            if (got_frame > 0) {
                LOGI("decode frame:%d", index++);
                addFrame(frame);
            }
        }
    }
    // ... release resources ...
}

To keep playback real-time, the number of AVFrames in the queue should not be too large. In later sections, AVFrames are run through the filters before being queued. So if the maximum buffer size has been reached, addFrame blocks in pthread_cond_wait until frames are consumed, as follows:

void addFrame(AVFrame *src) {
    AVFrame *frame = av_frame_alloc();
    if (av_frame_ref(frame, src) >= 0) {// copy the frame
        pthread_mutex_lock(&mutex);
        if (queue.size() == QUEUE_SIZE) {
            LOGI("wait for add frame...%d", queue.size());
            pthread_cond_wait(&notfull, &mutex);// wait until the queue is not full
        }
        queue.push_back(frame);
        pthread_cond_signal(&notempty);// signal that the queue is not empty
        pthread_mutex_unlock(&mutex);
    }
}
7.2.3. Get PCM data and play it through the OpenSL ES callback

We can consume avFrames added to the queue by registering buffer callbacks. The first thing to do is have a getFrame method

AVFrame *getFrame() {
    pthread_mutex_lock(&mutex);
    while (true) {
        if (!queue.empty()) {
            AVFrame *out = av_frame_alloc();
            AVFrame *src = queue.front();
            if (av_frame_ref(out, src) < 0)break;
            queue.erase(queue.begin());// Delete elements
            av_free(src);
            if (queue.size() < QUEUE_SIZE)pthread_cond_signal(&notfull);// Send the notFull signal
            pthread_mutex_unlock(&mutex);
            return out;
        } else {// queue empty, wait for a frame to be added
            LOGI("wait for get frame");
            pthread_cond_wait(&notempty, &mutex);
        }
    }
    pthread_mutex_unlock(&mutex);
    return NULL;
}

Then we implement the original getPCM method as follows:

int getPCM(uint8_t **out) {
    AVFrame *frame = getFrame();
    if (frame) {
        uint8_t *data = (uint8_t *) av_malloc(MAX_AUDIO_SIZE);
        swr_convert(swr_ctx, &data, MAX_AUDIO_SIZE, (const uint8_t **) frame->data,
                    frame->nb_samples);
        int out_size = av_samples_get_buffer_size(NULL, out_ch_layout_nb, frame->nb_samples,
                                                  out_sample_fmt, 0);
        *out = data;
        return out_size;
    }
    return 0;
}

swr_convert converts the AVFrame data into a uint8_t array, which is then enqueued for playback through the buffer queue interface.
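Note that the buffer handed to Enqueue must remain valid until OpenSL ES has consumed it, so it cannot be freed immediately. One simple pattern (an assumption, not taken from the project code) is to keep the pointer and free it on the next callback, when the previous buffer has finished playing:

static uint8_t *lastBuffer = NULL;  // buffer enqueued on the previous callback (assumed helper)

void playCallback(SLAndroidSimpleBufferQueueItf bq, void *args) {
    uint8_t *data;
    int size = getPCM(&data);
    if (size > 0) {
        (*bq)->Enqueue(bq, data, size);
        if (lastBuffer) av_free(lastBuffer);  // the previous buffer is done playing
        lastBuffer = data;
    }
}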

8. FFmpeg player implementation

With all of the above, you are ready to build an FFmpeg audio player. The main requirements: mixed playback of multiple audio files, volume control for each track, and variable-speed playback of the mixed audio.

8.1 AudioPlayer class

First we create a C++ class named AudioPlayer, with member variables for audio decoding, filtering, the frame queue, PCM output, multithreading and OpenSL ES. The code is as follows:

// decoding
int fileCount;                  // number of input audio files
AVFormatContext **fmt_ctx_arr;  // array of FFmpeg format contexts
AVCodecContext **codec_ctx_arr; // array of decoder contexts
int *stream_index_arr;          // array of audio stream indexes
// filters
AVFilterGraph *graph;
AVFilterContext **srcs;         // input filters
AVFilterContext *sink;          // output filter
char **volumes;                 // volume of each track
char *tempo;                    // playback speed, 0.5~2.0

// AVFrame queue
std::vector<AVFrame *> queue;   // stores decoded and filtered AVFrames

// input/output format
SwrContext *swr_ctx;            // resampler that converts AVFrames to PCM data
uint64_t in_ch_layout;
int in_sample_rate;             // input sample rate
int in_ch_layout_nb;            // input channel count, used with swr_ctx
enum AVSampleFormat in_sample_fmt; // input audio sample format

uint64_t out_ch_layout;
int out_sample_rate;            // output sample rate
int out_ch_layout_nb;           // output channel count, used with swr_ctx
int max_audio_frame_size;       // maximum buffer size
enum AVSampleFormat out_sample_fmt; // output audio sample format

// progress related
AVRational time_base;           // time base, used to calculate progress
double total_time;              // total duration (seconds)
double current_time;            // current position (seconds)
int isPlay = 0;                 // playing state, 1: playing

// multi-threading
pthread_t decodeId;             // decode thread id
pthread_t playId;               // play thread id
pthread_mutex_t mutex;          // synchronization lock
pthread_cond_t not_full;        // not-full condition, used when producing AVFrames
pthread_cond_t not_empty;       // not-empty condition, used when consuming AVFrames

// OpenSL ES
SLObjectItf engineObject;       // engine object
SLEngineItf engineItf;          // engine interface
SLObjectItf mixObject;          // output mix object
SLObjectItf playerObject;       // player object
SLPlayItf playItf;              // player interface
SLAndroidSimpleBufferQueueItf bufferQueueItf;   // buffer queue interface

8.2 Decoding and playing process of the player

int createPlayer();                     // create the player
int initCodecs(char **pathArr);         // initialize the decoders
int initSwrContext();                   // initialize SwrContext
int initFilters();                      // initialize the filters

The constructor takes the array of audio file paths and the file count, and calls the related initialization methods

AudioPlayer::AudioPlayer(char **pathArr, int len) {
    / / initialization
    fileCount = len;
    // Default volume 1.0 speed 1.0
    volumes = (char **) malloc(fileCount * sizeof(char *));
    for (int i = 0; i < fileCount; i++) {
        volumes[i] = "1.0";
    }
    tempo = "1.0";

    pthread_mutex_init(&mutex, NULL);
    pthread_cond_init(&not_full, NULL);
    pthread_cond_init(&not_empty, NULL);

    initCodecs(pathArr);
    avfilter_register_all();
    initSwrContext();
    initFilters();
    createPlayer();
}

Here we also initialize variables that control the volume and speed of each audio, synchronization locks, and condition variables (production consumption control).

8.3 Implementation

8.3.1 Initializing the decoder Array
int AudioPlayer::initCodecs(char **pathArr) {
    LOGI("init codecs");
    av_register_all();
    fmt_ctx_arr = (AVFormatContext **) malloc(fileCount * sizeof(AVFormatContext *));
    codec_ctx_arr = (AVCodecContext **) malloc(fileCount * sizeof(AVCodecContext *));
    stream_index_arr = (int *) malloc(fileCount * sizeof(int));
    for (int n = 0; n < fileCount; n++) {
        // ... init the context, open the file, find the audio stream index ...
        stream_index_arr[n] = audio_stream_index;
        // get the decoder
        AVCodecContext *codec_ctx = avcodec_alloc_context3(NULL);
        codec_ctx_arr[n] = codec_ctx;
        AVStream *stream = fmt_ctx->streams[audio_stream_index];
        avcodec_parameters_to_context(codec_ctx, fmt_ctx->streams[audio_stream_index]->codecpar);
        AVCodec *codec = avcodec_find_decoder(codec_ctx->codec_id);
        if (n == 0) {// Get the input format
            in_sample_fmt = codec_ctx->sample_fmt;
            in_ch_layout = codec_ctx->channel_layout;
            in_sample_rate = codec_ctx->sample_rate;
            in_ch_layout_nb = av_get_channel_layout_nb_channels(in_ch_layout);
            max_audio_frame_size = in_sample_rate * in_ch_layout_nb;
            time_base = fmt_ctx->streams[audio_stream_index]->time_base;
            int64_t duration = stream->duration;
            total_time = av_q2d(stream->time_base) * duration;
            LOGI("total time:%lf", total_time);
        } else {// for multiple files, check that the formats match (sample rate, sample format, channel layout)
            if (in_ch_layout != codec_ctx->channel_layout
                || in_sample_fmt != codec_ctx->sample_fmt
                || in_sample_rate != codec_ctx->sample_rate) {
                LOGE("Input file format different");
                return -1;
            }
        }
        // open the decoder
        if (avcodec_open2(codec_ctx, codec, NULL) < 0) {
            LOGE("could not open codec");
            return -1;
        }
    }
    return 1;
}

The format information of the input audio is saved here for SwrContext initialization and Filter initialization.

8.3.2 Initializing the Filter Array
int AudioPlayer::initFilters() {
    LOGI("init filters");
    graph = avfilter_graph_alloc();
    srcs = (AVFilterContext **) malloc(fileCount * sizeof(AVFilterContext **));
    char args[128];
    AVDictionary *dic = NULL;
    // Mix filter
    AVFilter *amix = avfilter_get_by_name("amix");
    AVFilterContext *amix_ctx = avfilter_graph_alloc_filter(graph, amix, "amix");
    snprintf(args, sizeof(args), "inputs=%d:duration=first:dropout_transition=3", fileCount);
    if (avfilter_init_str(amix_ctx, args) < 0) {
        LOGE("error init amix filter");
        return -1;
    }

    const char *sample_fmt = av_get_sample_fmt_name(in_sample_fmt);
    snprintf(args, sizeof(args), "sample_rate=%d:sample_fmt=%s:channel_layout=0x%" PRIx64,
             in_sample_rate, sample_fmt, in_ch_layout);

    for (int i = 0; i < fileCount; i++) {
        // ... the abuffer and volume filters for input i are initialized here ...
        // ... then each volume filter is linked to amix
        if (avfilter_link(volume_ctx, 0, amix_ctx, i) < 0) {
            LOGE("error link to amix filter");
            return -1;
        }
    }
    // variable-speed filter atempo
    AVFilter *atempo = avfilter_get_by_name("atempo");
    // ... allocate atempo_ctx and set the tempo parameter ...

    // init the aformat filter for output format conversion

    AVFilter *aformat = avfilter_get_by_name("aformat");
    AVFilterContext *aformat_ctx = avfilter_graph_alloc_filter(graph, aformat, "aformat");
    snprintf(args, sizeof(args), "sample_rates=%d:sample_fmts=%s:channel_layouts=0x%" PRIx64,
             in_sample_rate, sample_fmt, in_ch_layout);
    if (avfilter_init_str(aformat_ctx, args) < 0) {
        LOGE("error init aformat filter");
        return -1;
    }
    // output buffer
    AVFilter *abuffersink = avfilter_get_by_name("abuffersink");
    // ... allocate sink and set the abuffersink parameters ...

    // link amix -> atempo
    if (avfilter_link(amix_ctx, 0, atempo_ctx, 0) < 0) {
        LOGE("error link to atempo filter");
        return -1;
    }
    if (avfilter_link(atempo_ctx, 0, aformat_ctx, 0) < 0) {
        LOGE("error link to aformat filter");
        return -1;
    }
    if (avfilter_link(aformat_ctx, 0, sink, 0) < 0) {
        LOGE("error link to abuffersink filter");
        return -1;
    }
    if (avfilter_graph_config(graph, NULL) < 0) {
        LOGE("error config graph");
        return -1;
    }

    return 1;
}

The input audio format information obtained while initializing the decoders is used to initialize the abuffer input filters (sample rate, sample format and channel layout must match), and then the volume, amix and atempo filters are linked in. This gives us volume adjustment, mixing and speed change.
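For reference, the per-input initialization elided above might look like the following sketch, based on the single-input filters from section 5 (the exact names and argument strings are assumptions):

    // Sketch for one input i inside the loop
    char name[50];
    AVFilter *abuffer = avfilter_get_by_name("abuffer");
    snprintf(name, sizeof(name), "src%d", i);
    AVFilterContext *abuffer_ctx = avfilter_graph_alloc_filter(graph, abuffer, name);
    avfilter_init_str(abuffer_ctx, args);           // args built from in_sample_rate/in_sample_fmt/in_ch_layout

    AVFilter *volume = avfilter_get_by_name("volume");
    snprintf(name, sizeof(name), "volume%d", i);
    AVFilterContext *volume_ctx = avfilter_graph_alloc_filter(graph, volume, name);
    av_dict_set(&dic, "volume", volumes[i], 0);     // per-track volume, e.g. "1.0"
    avfilter_init_dict(volume_ctx, &dic);

    srcs[i] = abuffer_ctx;
    avfilter_link(abuffer_ctx, 0, volume_ctx, 0);   // abuffer -> volume (volume -> amix is linked above)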

8.3.3 Initializing SwrContext
int AudioPlayer::initSwrContext() {
    LOGI("init swr context");
    swr_ctx = swr_alloc();
    out_sample_fmt = AV_SAMPLE_FMT_S16;
    out_ch_layout = AV_CH_LAYOUT_STEREO;
    out_ch_layout_nb = 2;
    out_sample_rate = in_sample_rate;
    max_audio_frame_size = out_sample_rate * 2;

    swr_alloc_set_opts(swr_ctx, out_ch_layout, out_sample_fmt, out_sample_rate, in_ch_layout,
                       in_sample_fmt, in_sample_rate, 0, NULL);
    if (swr_init(swr_ctx) < 0) {
        LOGE("error init SwrContext");
        return -1;
    }
    return 1;
}

So that the decoded AVFrames can be played under OpenSL ES, we use the fixed 16-bit format AV_SAMPLE_FMT_S16, the channel layout AV_CH_LAYOUT_STEREO with 2 channels, and the same sample rate as the input. The maximum size of the PCM data handed to the buffer callback is sample rate * 2.

8.3.4 Initializing OpenSL ES Player
int AudioPlayer::createPlayer() {
    // Create a player
    // Create and initialize the engine object
// SLObjectItf engineObject;
    slCreateEngine(&engineObject, 0, NULL, 0, NULL, NULL);
    (*engineObject)->Realize(engineObject, SL_BOOLEAN_FALSE);
    // Get the engine interface
// SLEngineItf engineItf;
    (*engineObject)->GetInterface(engineObject, SL_IID_ENGINE, &engineItf);
    // Get the output mix through the engine interface
// SLObjectItf mixObject;
    (*engineItf)->CreateOutputMix(engineItf, &mixObject, 0, 0, 0);
    (*mixObject)->Realize(mixObject, SL_BOOLEAN_FALSE);

    // Set player parameters
    SLDataLocator_AndroidSimpleBufferQueue
            android_queue = {SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, 2};
    SLuint32 samplesPerSec = (SLuint32) out_sample_rate * 1000;
    / / PCM format
    SLDataFormat_PCM pcm = {SL_DATAFORMAT_PCM,
                            2./ / two channels
                            samplesPerSec,
                            SL_PCMSAMPLEFORMAT_FIXED_16,
                            SL_PCMSAMPLEFORMAT_FIXED_16,
                            SL_SPEAKER_FRONT_LEFT | SL_SPEAKER_FRONT_RIGHT,// channel mask
                            SL_BYTEORDER_LITTLEENDIAN};

    SLDataSource slDataSource = {&android_queue, &pcm};

    // Output pipe
    SLDataLocator_OutputMix outputMix = {SL_DATALOCATOR_OUTPUTMIX, mixObject};
    SLDataSink audioSnk = {&outputMix, NULL};

    const SLInterfaceID ids[3] = {SL_IID_BUFFERQUEUE, SL_IID_EFFECTSEND, SL_IID_VOLUME};
    const SLboolean req[3] = {SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE};
    // Create and initialize the player object through the engine interface
// SLObjectItf playerObject;
    (*engineItf)->CreateAudioPlayer(engineItf, &playerObject, &slDataSource, &audioSnk, 1, ids,
                                    req);
    (*playerObject)->Realize(playerObject, SL_BOOLEAN_FALSE);

    // Get the playback interface
// SLPlayItf playItf;
    (*playerObject)->GetInterface(playerObject, SL_IID_PLAY, &playItf);
    // Get the buffer interface
// SLAndroidSimpleBufferQueueItf bufferQueueItf;
    (*playerObject)->GetInterface(playerObject, SL_IID_BUFFERQUEUE, &bufferQueueItf);

    // Register buffer callback
    (*bufferQueueItf)->RegisterCallback(bufferQueueItf, _playCallback, this);
    return 1;
}

The PCM format must be consistent with the parameters set by SwrContext
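Concretely, the mapping between the SwrContext output and SLDataFormat_PCM (OpenSL ES expresses the sample rate in milliHertz, hence the * 1000):

// AV_SAMPLE_FMT_S16       -> SL_PCMSAMPLEFORMAT_FIXED_16 (bitsPerSample and containerSize)
// AV_CH_LAYOUT_STEREO (2) -> numChannels = 2, SL_SPEAKER_FRONT_LEFT | SL_SPEAKER_FRONT_RIGHT
// out_sample_rate (Hz)    -> samplesPerSec in milliHertz:
SLuint32 samplesPerSec = (SLuint32) out_sample_rate * 1000;  // 48000 Hz -> 48000000 mHz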

8.3.5 Starting the playback thread and decoding thread
void *_decodeAudio(void *args) {
    AudioPlayer *p = (AudioPlayer *) args;
    p->decodeAudio();
    pthread_exit(0);
}

void *_play(void *args) {
    AudioPlayer *p = (AudioPlayer *) args;
    p->setPlaying();
    pthread_exit(0);
}

void AudioPlayer::setPlaying() {
    // Set the playback state
    (*playItf)->SetPlayState(playItf, SL_PLAYSTATE_PLAYING);
    _playCallback(bufferQueueItf, this);
}

void AudioPlayer::play() {
    isPlay = 1;
    pthread_create(&decodeId, NULL, _decodeAudio, this);
    pthread_create(&playId, NULL, _play, this);
}

In the play method we start the play and decode threads with pthread_create. The play thread sets the playing state through the play interface and then invokes the buffer callback; in the callback we take an AVFrame from the queue, convert it to PCM, and play it via Enqueue. The decode thread decodes and filters AVFrames and adds them to the queue.

8.3.6 Buffer callback
void _playCallback(SLAndroidSimpleBufferQueueItf bq, void *context) {
    AudioPlayer *player = (AudioPlayer *) context;
    AVFrame *frame = player->get();
    if (frame) {
        int size = av_samples_get_buffer_size(NULL, player->out_ch_layout_nb, frame->nb_samples,
                                              player->out_sample_fmt, 1);
        if (size > 0) {
            uint8_t *outBuffer = (uint8_t *) av_malloc(player->max_audio_frame_size);
            swr_convert(player->swr_ctx, &outBuffer, player->max_audio_frame_size,
                        (const uint8_t **) frame->data, frame->nb_samples);
            (*bq)->Enqueue(bq, outBuffer, size);
        }
    }
}
8.3.7 Decoding and Filtering
void AudioPlayer::decodeAudio() {
    LOGI("start decode...");
    AVFrame *frame = av_frame_alloc();
    AVPacket *packet = av_packet_alloc();
    int ret, got_frame;
    int index = 0;
    while (isPlay) {
        LOGI("decode frame:%d", index);
        for (int i = 0; i < fileCount; i++) {
            AVFormatContext *fmt_ctx = fmt_ctx_arr[i];
            ret = av_read_frame(fmt_ctx, packet);
            if (ret < 0) {
                LOGE("decode finish");
                goto end;
            }
            if (packet->stream_index != stream_index_arr[i]) continue;// skip non-audio packets
            ret = avcodec_decode_audio4(codec_ctx_arr[i], frame, &got_frame, packet);
            if (ret < 0) {
                LOGE("error decode packet");
                goto end;
            }
            if (got_frame <= 0) {
                LOGE("decode error or finish");
                goto end;
            }
            ret = av_buffersrc_add_frame(srcs[i], frame);
            if (ret < 0) {
                LOGE("error add frame to filter");
                goto end;
            }
        }
        LOGI("time:%lld,%lld,%lld", frame->pkt_dts, frame->pts, packet->pts);
        while (av_buffersink_get_frame(sink, frame) >= 0) {
            frame->pts = packet->pts;
            LOGI("put frame:%d,%lld", index, frame->pts);
            put(frame);
        }
        index++;
    }
    end:
    av_packet_unref(packet);
    av_frame_unref(frame);
}

One point to note: the packet read by av_read_frame is not necessarily from the audio stream, so packets must be filtered by the audio stream index. In the AVFrame obtained from av_buffersink_get_frame, pts is replaced with the packet's pts in order to preserve the progress (the pts of the filtered frame no longer reflects the current decoding position).

8.3.8 AVFrame storage and retrieval
/**
 * Add an AVFrame to the queue (length 5), blocking while full
 * @param frame
 * @return
 */
int AudioPlayer::put(AVFrame *frame) {
    AVFrame *out = av_frame_alloc();
    if (av_frame_ref(out, frame) < 0) return -1;// copy the AVFrame
    pthread_mutex_lock(&mutex);
    if (queue.size() == 5) {
        LOGI("queue is full,wait for put frame:%d".queue.size());
        pthread_cond_wait(&not_full, &mutex);
    }
    queue.push_back(out);
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&mutex);
    return 1;
}

/**
 * Take an AVFrame from the queue, blocking while empty
 * @return
 */
AVFrame *AudioPlayer::get() {
    AVFrame *out = av_frame_alloc();
    pthread_mutex_lock(&mutex);
    while (isPlay) {
        if (queue.empty()) {
            pthread_cond_wait(&not_empty, &mutex);
        } else {
            AVFrame *src = queue.front();
            if (av_frame_ref(out, src) < 0)return NULL;
            queue.erase(queue.begin());// Remove the fetched element
            av_free(src);
            if (queue.size() < 5)pthread_cond_signal(&not_full);
            pthread_mutex_unlock(&mutex);
            current_time = av_q2d(time_base) * out->pts;
            LOGI("get frame:%d,time:%lf".queue.size(), current_time);
            return out;
        }
    }
    pthread_mutex_unlock(&mutex);
    return NULL;
}

Two condition variables implement a producer/consumer model with a buffer of 5, used to store and fetch AVFrames from the queue. With the code above we can play multiple audio tracks at volume 1.0 and speed 1.0.

9. NDK playback control

Having built an FFmpeg-based player in the previous section, this section adds the various player controls: volume, speed, pause, play, seeking, and stop (releasing resources).

9.1 Create FFmpegAudioPlayer

Start by creating FFmpegAudioPlayer.kt (Kotlin) in the Java layer and adding the following methods for JNI

class FFmpegAudioPlayer {
    /** initialize */
    external fun init(paths: Array<String>)

    /** play */
    external fun play()

    /** pause */
    external fun pause()

    /** release resources */
    external fun release()

    /** change the volume of each track */
    external fun changeVolumes(volumes: Array<String>)

    /** change speed */
    external fun changeTempo(tempo: String)

    /** total duration in seconds */
    external fun duration(): Double

    /** current position in seconds */
    external fun position(): Double

    /** seek */
    external fun seek(sec: Double)

    companion object {
        init {
            System.loadLibrary("avutil-55")
            System.loadLibrary("swresample-2")
            System.loadLibrary("avcodec-57")
            System.loadLibrary("avfilter-6")
            System.loadLibrary("swscale-4")
            System.loadLibrary("avformat-57")
            System.loadLibrary("native-lib")
        }
    }
}

Then in the JNI layer, the corresponding method is implemented.

#include "AudioPlayer.h"
static AudioPlayer *player;
extern "C" JNIEXPORT void
JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_FFmpegAudioPlayer_init(
        JNIEnv *env,
        jobject /* this */, jobjectArray _srcs) {
    jsize len = env->GetArrayLength(_srcs);
    char **pathArr = (char **) malloc(len * sizeof(char *));
    int i = 0;
    for (i = 0; i < len; i++) {
        jstring str = static_cast<jstring>(env->GetObjectArrayElement(_srcs, i));
        pathArr[i] = const_cast<char *>(env->GetStringUTFChars(str, 0));
    }
    player = new AudioPlayer(pathArr, len);
}

extern "C" JNIEXPORT void
JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_FFmpegAudioPlayer_changeVolumes(
        JNIEnv *env,
        jobject /* this */, jobjectArray _volumes) {
    jsize len = env->GetArrayLength(_volumes);
    int i = 0;
    for (i = 0; i < len; i++) {
        jstring str = static_cast<jstring>(env->GetObjectArrayElement(_volumes, i));
        char *volume = const_cast<char *>(env->GetStringUTFChars(str, 0));
        player->volumes[i] = volume;
    }
    player->change = 1;// mark that parameters changed so the filters are reinitialized before the next decode
}

extern "C" JNIEXPORT void
JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_FFmpegAudioPlayer_changeTempo(
        JNIEnv *env,
        jobject /* this */, jstring _tempo) {
    char *tempo = const_cast<char *>(env->GetStringUTFChars(_tempo, 0));
    player->tempo = tempo;
    player->change = 1;
}
extern "C" JNIEXPORT void
JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_FFmpegAudioPlayer_play(
        JNIEnv *env,
        jobject /* this */) {
    player->play();
}

extern "C" JNIEXPORT void
JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_FFmpegAudioPlayer_pause(
        JNIEnv *env,
        jobject /* this */) {
    player->pause();
}

extern "C" JNIEXPORT void
JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_FFmpegAudioPlayer_release(
        JNIEnv *env,
        jobject /* this */) {
    player->release();
}
extern "C" JNIEXPORT void
JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_FFmpegAudioPlayer_seek(
        JNIEnv *env,
        jobject /* this */, jdouble secs) {
    player->seek(secs);
}

extern "C" JNIEXPORT jdouble
JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_FFmpegAudioPlayer_duration(
        JNIEnv *env,
        jobject /* this */) {
    return player->total_time;
}

extern "C" JNIEXPORT jdouble
JNICALL
Java_io_github_iamyours_ffmpegaudioplayer_FFmpegAudioPlayer_position(
        JNIEnv *env,
        jobject /* this */) {
    return player->current_time;
}

The actual implementation is in AudioPlayer.cpp.

9.2 Volume and Speed Control

To change speed or volume, we need to modify the filter parameters before decoding. A change flag marks that the filter graph must be reinitialized; after reinitialization, change is reset to 0.

int AudioPlayer::initFilters() {
    LOGI("init filters");
    if (change) avfilter_graph_free(&graph);
    graph = avfilter_graph_alloc();
    // ...
    change = 0;
    return 1;
}

The previous filter graph must be freed to avoid memory leaks. Before each decoding round, the filters are reinitialized when the change flag is set.

void AudioPlayer::decodeAudio() {
    ...
    while (isPlay) {
        LOGI("decode frame:%d", index);
        if (change) {
            initFilters();
        }
        for (int i = 0; i < fileCount; i++) {
            AVFormatContext *fmt_ctx = fmt_ctx_arr[i];
            ret = av_read_frame(fmt_ctx, packet);
            if (packet->stream_index != stream_index_arr[i]) continue;
            // ... decode and add the frame as before ...
            ret = av_buffersrc_add_frame(srcs[i], frame);
            if (ret < 0) {
                LOGE("error add frame to filter");
                goto end;
            }
        }
        while (av_buffersink_get_frame(sink, frame) >= 0) {
            frame->pts = packet->pts;
            put(frame);
        }
        index++;
    }
    end:
   ...
}

In this way, volume and speed control can be achieved.

9.3 Pause and Play

Playback can be paused by setting the paused state through the OpenSL ES play interface. While this state is set, the buffer callback stops being invoked.

void AudioPlayer::pause() {
    (*playItf)->SetPlayState(playItf, SL_PLAYSTATE_PAUSED);
}

To resume playback, we only need to set the SL_PLAYSTATE_PLAYING state

void AudioPlayer::play() {
    LOGI("play...");
    if (isPlay) {
        (*playItf)->SetPlayState(playItf, SL_PLAYSTATE_PLAYING);
        return;
    }
    isPlay = 1;
    seek(0);
    pthread_create(&decodeId, NULL, _decodeAudio, this);
    pthread_create(&playId, NULL, _play, this);
}

9.4 Progress Control

Progress control is implemented with av_seek_frame; av_q2d is used to convert seconds into FFmpeg's internal timestamp units

void AudioPlayer::seek(double secs) {
    pthread_mutex_lock(&mutex);
    for (int i = 0; i < fileCount; i++) {
        av_seek_frame(fmt_ctx_arr[i], stream_index_arr[i], (int64_t) (secs / av_q2d(time_base)),
                      AVSEEK_FLAG_ANY);
    }
    current_time = secs;
    queue.clear();
    pthread_cond_signal(&not_full);
    pthread_mutex_unlock(&mutex);
}

9.5 Releasing Resources

Set the player state to stopped, release the OpenSL ES resources, the filter resources and the decoder resources, and close the input streams.

void AudioPlayer::release() {
    pthread_mutex_lock(&mutex);
    isPlay = 0;
    pthread_cond_signal(&not_full);
    pthread_mutex_unlock(&mutex);
    if (playItf)(*playItf)->SetPlayState(playItf, SL_PLAYSTATE_STOPPED);
    if (playerObject) {
        (*playerObject)->Destroy(playerObject);
        playerObject = 0;
        bufferQueueItf = 0;
    }
    if (mixObject) {
        (*mixObject)->Destroy(mixObject);
        mixObject = 0;
    }
    if (engineObject) {
        (*engineObject)->Destroy(engineObject);
        engineItf = 0;
    }
    if (swr_ctx) {
        swr_free(&swr_ctx);
    }
    if (graph) {
        avfilter_graph_free(&graph);
    }
    for (int i = 0; i < fileCount; i++) {
        avcodec_close(codec_ctx_arr[i]);
        avformat_close_input(&fmt_ctx_arr[i]);
    }
    free(codec_ctx_arr);
    free(fmt_ctx_arr);
    LOGI("release...");
}

9.6 Final Effect

The final effect can be seen by running the project.