Here’s a question: How do I adjust the volume of sound while playing a video?

When we play a video on an Android phone and the sound is too loud, we turn the volume down by hand; when the sound is too quiet, we turn it back up by hand.

All of this is manual. If a user who keeps scrolling through short videos has to keep pressing the volume keys, the experience becomes unbearable.

This raises a requirement: after decoding the audio stream, can we scale the raw audio data with a matrix operation (multiplying every sample by a factor) to adjust the volume?

This idea is feasible, so let’s analyze the characteristics of sound, and then show how to do it.

Three characteristics of sound:

  • Pitch: how high or low a sound is perceived to be. It is one of the three main subjective attributes of a sound, together with loudness (volume) and timbre (also known as sound quality). Pitch is determined mainly by the frequency of the sound, and is also affected by its intensity.
  • Loudness: the subjectively perceived strength of a sound, commonly known as volume. It is determined by the amplitude and by the distance from the source: the larger the amplitude and the shorter the distance, the greater the loudness. (Unit: dB)
  • Timbre: also known as tone color. The waveform determines the timbre of a sound, and sounds differ because the objects and materials that produce them differ. Timbre itself is abstract, and the waveform is its concrete, visible expression. Typical waveforms include the square wave, sawtooth wave, sine wave and pulse wave. Sounds with different timbres can be told apart by their waveforms.

The amplitude of a sound wave indicates the volume of a sound:

Wavelength is a measure of the pitch of a sound:

Timbre is mainly related to the shape of the sound wave:

What we are going to adjust here is loudness, that is, the amplitude of the sound. Those of you who have studied matrix operations know that amplitude can be adjusted with a simple operation: multiply every sample by a factor. FFmpeg has an audio filter for exactly this: -af volume=3dB

With the power of FFmpeg this is easy, but can it be done with MediaCodec? We are going to adjust the sound amplitude dynamically in ExoPlayer.

DefaultAudioSink.java is the class that controls audio playback in ExoPlayer. A variety of AudioProcessors are provided to process the audio data. The main methods of AudioProcessor are:

  AudioFormat configure(AudioFormat inputAudioFormat) throws UnhandledAudioFormatException;

  boolean isActive();

  void queueInput(ByteBuffer buffer);

  void queueEndOfStream();

  ByteBuffer getOutput();

  boolean isEnded();

  void flush();

  void reset();
  • configure: configures the processor for the given input AudioFormat
  • isActive: indicates whether the current AudioProcessor is active
  • queueInput: queues an input buffer; this ByteBuffer holds the raw audio data
  • queueEndOfStream: signals that no more input will be queued
  • getOutput: returns the processed ByteBuffer, which DefaultAudioSink writes to the AudioTrack

The AudioProcessors provided include:

  • ChannelMappingAudioProcessor
  • FloatResamplingAudioProcessor: converts 24-bit and 32-bit integer audio into 32-bit floating-point audio; floating-point samples can represent the sound with finer detail than integer samples
  • ResamplingAudioProcessor: resamples audio data of other bit depths to 16-bit audio data
  • SilenceSkippingAudioProcessor: skips silence
  • TeeAudioProcessor
  • TrimmingAudioProcessor
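
To make the queueInput / getOutput idea concrete, here is a minimal sketch of the work a volume-changing processor would do on one buffer of 16-bit PCM. The class PcmVolumeSketch and the method scalePcm16 are hypothetical names invented for this illustration; they are not part of ExoPlayer, and the sketch does not implement the AudioProcessor interface itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of ExoPlayer: scales 16-bit PCM samples by a gain factor.
final class PcmVolumeSketch {

  /** Returns a new buffer holding the input's remaining 16-bit samples multiplied by gain. */
  static ByteBuffer scalePcm16(ByteBuffer input, float gain) {
    // Work on a duplicate so the caller's position is untouched; duplicate() resets the
    // byte order, so it is copied over explicitly.
    ByteBuffer in = input.duplicate().order(input.order());
    ByteBuffer out = ByteBuffer.allocate(in.remaining()).order(in.order());
    while (in.remaining() >= 2) {
      int scaled = Math.round(in.getShort() * gain);
      // Clamp to the 16-bit range so loud samples clip instead of wrapping around.
      scaled = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
      out.putShort((short) scaled);
    }
    out.flip();
    return out;
  }

  public static void main(String[] args) {
    ByteBuffer pcm = ByteBuffer.allocate(6).order(ByteOrder.LITTLE_ENDIAN);
    pcm.putShort((short) 1000).putShort((short) -1000).putShort((short) 30000);
    pcm.flip();
    ByteBuffer louder = scalePcm16(pcm, 1.5f);
    while (louder.hasRemaining()) {
      System.out.println(louder.getShort()); // 1500, -1500, 32767 (clipped)
    }
  }
}
```

In a real processor this scaling would presumably run when input is queued, with the resulting buffer handed back from getOutput.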

SonicAudioProcessor is very important: we rely on it for variable-speed playback and for adjusting the amplitude of the sound. SonicAudioProcessor changes pitch, speed and volume according to the Sonic algorithm.

We are only talking about how to adjust the amplitude of sound: github.com/JeffMony/Pl…

  private void processStreamInput() {
    // Resample as many pitch periods as we have buffered on the input.
    int originalOutputFrameCount = outputFrameCount;
    float s = speed / pitch;
    float r = rate * pitch;
    if (s > 1.00001 || s < 0.99999) {
      changeSpeed(s);
    } else {
      copyToOutput(inputBuffer, 0, inputFrameCount);
      inputFrameCount = 0;
    }
    if (r != 1) {
      adjustRate(r, originalOutputFrameCount);
    }
    if (volume != 1.0f) {
      // Adjust the output volume.
      scaleSamples(outputBuffer, originalOutputFrameCount,
          outputFrameCount - originalOutputFrameCount, volume);
    }
  }

  private void scaleSamples(short samples[], int position, int numSamples, float volume) {
    // Volume as a fixed-point number with 12 fractional bits (4096 == 1.0).
    int fixedPointVolume = (int) (volume * 4096.0f);
    int start = position * channelCount;
    int stop = start + numSamples * channelCount;
    for (int xSample = start; xSample < stop; xSample++) {
      int value = (samples[xSample] * fixedPointVolume) >> 12;
      // Clip to the 16-bit sample range.
      if (value > 32767) {
        value = 32767;
      } else if (value < -32767) {
        value = -32767;
      }
      samples[xSample] = (short) value;
    }
  }

The volume here is the factor by which the amplitude is multiplied. For example, volume=1.2f scales the amplitude to 1.2 times its original value.
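
As a worked example of the arithmetic in scaleSamples, the standalone sketch below applies the same fixed-point scaling to a plain short array. FixedPointVolumeDemo and its scale method are names made up for this illustration, and the channel and position handling of the original is left out.

```java
import java.util.Arrays;

// Standalone sketch of the fixed-point volume scaling, modeled on scaleSamples above.
final class FixedPointVolumeDemo {

  /** Scales samples in place by volume, using 12 fractional bits (4096 == 1.0). */
  static void scale(short[] samples, float volume) {
    int fixedPointVolume = (int) (volume * 4096.0f); // 1.2f becomes 4915
    for (int i = 0; i < samples.length; i++) {
      int value = (samples[i] * fixedPointVolume) >> 12;
      // Clip so that amplified peaks saturate instead of overflowing the short range.
      value = Math.max(-32767, Math.min(32767, value));
      samples[i] = (short) value;
    }
  }

  public static void main(String[] args) {
    short[] samples = {10000, -10000, 30000};
    scale(samples, 1.2f);
    System.out.println(Arrays.toString(samples)); // [11999, -12000, 32767]
  }
}
```

The results are 11999 and -12000 rather than exactly 12000 because 1.2 is truncated to 4915/4096 and the right shift rounds toward negative infinity. The third sample would be about 36000, so it is clipped to 32767.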

How do we calculate the decibel level of a sound? volume(dB) = 20 * log(Cur / Max), where log is the base-10 logarithm. Because Cur never exceeds Max, the decibel value is never positive (this is the case on Android).

  • Volume indicates the calculated decibel value
  • Max indicates the maximum amplitude
  • Cur represents the current amplitude
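
The formula can be checked with a few lines of Java. This is only a sketch: DecibelSketch and amplitudeToDb are illustrative names, and 32768 is assumed as the maximum amplitude because the samples discussed here are 16-bit.

```java
// Sketch of the decibel formula: volume(dB) = 20 * log10(Cur / Max).
final class DecibelSketch {

  /** Level of amplitude cur relative to max, in dB: 0 at full scale, negative below it. */
  static double amplitudeToDb(double cur, double max) {
    return 20.0 * Math.log10(cur / max);
  }

  public static void main(String[] args) {
    // Assumed full scale for 16-bit audio.
    double max = 32768.0;
    System.out.printf("%.2f dB%n", amplitudeToDb(16384.0, max)); // half amplitude, about -6.02 dB
    System.out.printf("%.2f dB%n", amplitudeToDb(32768.0, max)); // full scale, 0.00 dB
  }
}
```

Halving the amplitude lowers the level by about 6 dB, which is a handy rule of thumb when reading these values.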

You can enter two parameters: MeanVolume and BaseVolume

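  • MeanVolume: the average decibel level
  • BaseVolume: the baseline decibel level

The derivation, where CurBase is the amplitude that corresponds to BaseVolume:

  BaseVolume - MeanVolume = result
  20 * log(CurBase / Max) - 20 * log(Cur / Max) = result
  20 * log(CurBase / Cur) = result
  CurBase / Cur = 10^(result / 20)
  CurBase = Cur * 10^(result / 20)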

What we care about is the factor 10^(result/20).

This is the amplitude multiplier, so the volume value to set in ExoPlayer is 10^(result/20).
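
The conversion from the two decibel values to the multiplier can be sketched as follows. GainFromDecibels and gainFor are hypothetical names, and the decibel values in main are made-up inputs; how the resulting factor is handed to the player depends on the project's own API, which is not shown here.

```java
// Sketch: turn the difference between two dB levels into an amplitude multiplier.
final class GainFromDecibels {

  /** Returns 10^((baseVolumeDb - meanVolumeDb) / 20), the factor to multiply samples by. */
  static float gainFor(double baseVolumeDb, double meanVolumeDb) {
    double result = baseVolumeDb - meanVolumeDb;
    return (float) Math.pow(10.0, result / 20.0);
  }

  public static void main(String[] args) {
    // Made-up example: audio averaging -22 dB that should be raised to a -16 dB baseline.
    float gain = gainFor(-16.0, -22.0);
    System.out.println(gain); // roughly 2.0, since +6 dB about doubles the amplitude
  }
}
```

A result of 0 dB gives a factor of 1 (no change), and a negative result gives a factor below 1, which lowers the volume.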

The source code in this article comes from the project: github.com/JeffMony/Pl…