One, what is PCM?

PCM (Pulse Code Modulation) is a common coding format for converting an analog signal into a digital signal. PCM samples the analog signal at fixed intervals and then quantizes the amplitude of each sample into a binary value.

PCM data represents the amplitude of an audio signal over time. Android supports PCM audio data in WAV files.

Advantages: the closest approximation to absolute fidelity. Disadvantages: large data size.

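To get a sense of the size: one minute of CD-quality PCM (44100 Hz sampling rate, 16-bit samples, 2 channels) occupies about 44100 × 2 bytes × 2 channels × 60 s ≈ 10.1 MB, which is why PCM is usually compressed (for example to AAC or MP3) for storage and transmission.
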
Two, the recording process

Recording is the process of encoding an analog signal into a digital signal.

AudioRecord gives the application layer the ability to capture audio data from the sound-recording hardware. The working process is as follows (a minimal sketch follows the list):

  • 1. Construct an AudioRecord object and obtain the minimum required recording buffer size via the getMinBufferSize method.
  • 2. Initialize a buffer, greater than or equal to the size from step 1, to hold the sound data read from AudioRecord.
  • 3. Start recording: AudioRecord#startRecording()
  • 4. Read the audio data from AudioRecord into the buffer initialized in step 2, create a file output stream, and write the buffer data to a local file. During this step the recording duration can be reported back to the upper layer via a callback.
  • 5. Close the data stream.
  • 6. Stop recording: AudioRecord#stop()

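A minimal sketch of the steps above, assuming the android.media.AudioRecord / AudioFormat / MediaRecorder classes and the RECORD_AUDIO permission; the output path and the isRecording flag are illustrative, not from the original text:

void recordPcm(String outputPath, AtomicBoolean isRecording) throws IOException {
    int sampleRate = 44100;                                 // suggested sampling rate
    int channelConfig = AudioFormat.CHANNEL_IN_MONO;
    int audioFormat = AudioFormat.ENCODING_PCM_16BIT;

    // Step 1: construct AudioRecord with the minimum recording buffer size.
    int minBufferSize = AudioRecord.getMinBufferSize(sampleRate, channelConfig, audioFormat);
    AudioRecord record = new AudioRecord(MediaRecorder.AudioSource.MIC,
            sampleRate, channelConfig, audioFormat, minBufferSize);

    // Step 2: a buffer no smaller than the minimum size.
    byte[] buffer = new byte[minBufferSize];

    // Step 3: start recording.
    record.startRecording();

    // Step 4: pull PCM data out of AudioRecord and write it to a local file.
    try (FileOutputStream out = new FileOutputStream(outputPath)) {
        while (isRecording.get()) {
            int read = record.read(buffer, 0, buffer.length);
            if (read > 0) {
                out.write(buffer, 0, read);
            }
        }
    } // Step 5: the stream is closed here by try-with-resources.

    // Step 6: stop recording and release the hardware resources.
    record.stop();
    record.release();
}
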
So how do I play what I recorded in the player?

PCM files are raw data and cannot be recognized by ordinary players. WAV files can be recognized because they carry a WAVE header (a sketch of the header follows).

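For reference, a sketch of the standard 44-byte RIFF/WAVE header that, written in front of the raw PCM bytes, turns the file into a WAV that players can recognize; the helper name is illustrative and it uses java.nio.ByteBuffer:

static byte[] wavHeader(int pcmDataSize, int sampleRate, int channels, int bitsPerSample) {
    int byteRate = sampleRate * channels * bitsPerSample / 8;   // bytes per second
    int blockAlign = channels * bitsPerSample / 8;              // bytes per sample frame
    ByteBuffer header = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
    header.put("RIFF".getBytes(StandardCharsets.US_ASCII));
    header.putInt(36 + pcmDataSize);                            // overall chunk size
    header.put("WAVE".getBytes(StandardCharsets.US_ASCII));
    header.put("fmt ".getBytes(StandardCharsets.US_ASCII));
    header.putInt(16);                                          // size of the fmt sub-chunk
    header.putShort((short) 1);                                 // 1 = uncompressed PCM
    header.putShort((short) channels);
    header.putInt(sampleRate);
    header.putInt(byteRate);
    header.putShort((short) blockAlign);
    header.putShort((short) bitsPerSample);
    header.put("data".getBytes(StandardCharsets.US_ASCII));
    header.putInt(pcmDataSize);                                 // size of the PCM payload
    return header.array();
}

Writing this header followed by the recorded PCM data produces a playable WAV file.
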
AudioRecord vs. MediaRecorder

Both can record audio, but MediaRecorder is the higher-level API: it handles encoding and writing to a file container internally, whereas AudioRecord outputs raw PCM and leaves encoding to the application.

Attached: AudioRecord constructor parameters

  • AudioSource: the audio hardware source, e.g. MediaRecorder.AudioSource.MIC
  • SampleRateInHz: sampling frequency, in Hertz. Suggestion: 44100 Hz
  • ChannelConfig: mono or stereo
  • AudioFormat: audio data precision, such as 8-bit or 16-bit PCM
  • BufferSizeInBytes: the minimum buffer size required by AudioRecord (queried via the native getMinBufferSize method), which matters when the application layer allocates its buffer.
public AudioRecord(int audioSource, int sampleRateInHz, int channelConfig, int audioFormat, int bufferSizeInBytes)

Three, sound playback

Playback is the process of decoding a digital signal back into an analog signal.

AudioTrack provides the application layer with the ability to write PCM audio, as a buffered stream, to the hardware audio sink for playback. There are two transfer modes: static and streaming.

Static:

MODE_STATIC: all of the data is written to the buffer in a single call before playback. The advantage is low latency; the disadvantage is that the amount of data is limited.

Streaming:

MODE_STREAM: similar to writing a file through IO, data is copied repeatedly from the user buffer into AudioTrack during playback. Advantages: handles large amounts of data and high sampling rates. Disadvantage: prone to latency.

Each AudioTrack is registered with AudioFlinger when it is created. AudioFlinger mixes all AudioTracks (via AudioMixer) and then sends the result to AudioHardware for playback. Currently, Android can create at most 32 audio streams at the same time, meaning the Mixer can process at most 32 AudioTrack data streams simultaneously.

The playback workflow is the same idea as AudioRecord, but in the opposite direction (a minimal sketch follows the list):

  • 1. Construct the AudioTrack object and obtain the minimum required playback buffer size via the getMinBufferSize method.
  • 2. Initialize a buffer, greater than or equal to the size from step 1, to hold the sound data written to AudioTrack.
  • 3. Start playback: AudioTrack#play()
  • 4. Create a file input stream and read the sound data into the buffer initialized in step 2. The buffer data is sent towards AudioHardware via AudioTrack#write.
  • 5. Close the data stream.
  • 6. Stop playback: AudioTrack#stop()

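A minimal streaming-mode sketch of the steps above; the PCM file path is illustrative, and the format parameters must match how the file was recorded:

void playPcm(String pcmPath) throws IOException {
    int sampleRate = 44100;
    int channelConfig = AudioFormat.CHANNEL_OUT_MONO;
    int audioFormat = AudioFormat.ENCODING_PCM_16BIT;

    // Step 1: construct AudioTrack with the minimum playback buffer size.
    int minBufferSize = AudioTrack.getMinBufferSize(sampleRate, channelConfig, audioFormat);
    AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC,
            sampleRate, channelConfig, audioFormat, minBufferSize, AudioTrack.MODE_STREAM);

    // Step 2: a buffer no smaller than the minimum size.
    byte[] buffer = new byte[minBufferSize];

    // Step 3: start playback; in MODE_STREAM, data written afterwards is played as it arrives.
    track.play();

    // Step 4: read PCM data from the file and push it towards the audio hardware.
    try (FileInputStream in = new FileInputStream(pcmPath)) {
        int read;
        while ((read = in.read(buffer)) > 0) {
            track.write(buffer, 0, read);
        }
    } // Step 5: the stream is closed here by try-with-resources.

    // Step 6: stop playback and release the track.
    track.stop();
    track.release();
}
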
AudioTrack vs. MediaPlayer vs. SoundPool: differences and usage scenarios?

  • AudioTrack can only play PCM and WAV data that does not require decoding

  • MediaPlayer still creates an AudioTrack at the framework layer and passes the decoded PCM data to it; the AudioTrack then passes the data to AudioFlinger for mixing, and from there it goes to the hardware for playback, so MediaPlayer contains an AudioTrack internally. MediaPlayer is better suited for playing long local music files or online streaming resources in the background.

  • SoundPool is suited to short audio clips, such as game sound effects, button sounds, and ringtone fragments, and it can play multiple sounds at the same time. Is it possible to use this for car-navigation mixing?

Attached: AudioTrack constructor parameters

  • StreamType: the type of the audio stream, for example AudioManager.STREAM_MUSIC.
  • SampleRateInHz: sampling frequency, in Hertz. Suggestion: 44100 Hz.
  • ChannelConfig: mono or stereo.
  • AudioFormat: audio data precision, such as 8-bit or 16-bit PCM.
  • BufferSizeInBytes: the minimum buffer size required by AudioTrack (queried via the native getMinBufferSize method), which matters when the application layer allocates its buffer.
  • Mode: AudioTrack.MODE_STREAM or AudioTrack.MODE_STATIC.
public AudioTrack(int streamType, 
int sampleRateInHz, 
int channelConfig, 
int audioFormat,
int bufferSizeInBytes, 
int mode)

Four, MediaCodec

MediaCodec is the audio/video codec component, responsible for encoding and decoding audio (e.g. AAC) and video (e.g. H.264). It is often used together with MediaExtractor, MediaMuxer, Surface, and AudioTrack.

Taking the processing of input data In into output data Out as an example, its working process is as follows (a minimal sketch follows the list):

  • 1. Request an empty input buffer A, fill it with data, and send it to MediaCodec for processing.
  • 2. MediaCodec consumes input data A and writes the processed result into an empty output buffer B.
  • 3. The application layer takes data B from the output buffer, consumes it, and returns the buffer to the codec.
  • 4. Release the MediaCodec when finished.

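A minimal sketch of that buffer loop using the synchronous API; the MIME type is illustrative, "format" is assumed to come from MediaExtractor or be built by hand, and "source"/"sink" are hypothetical stand-ins for whatever supplies input data and consumes output:

MediaCodec codec = MediaCodec.createDecoderByType("audio/mp4a-latm"); // e.g. an AAC decoder
codec.configure(format, null, null, 0);
codec.start();

MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
boolean done = false;
while (!done) {
    // 1. Take an empty input buffer, fill it, and queue it back to the codec.
    int inIndex = codec.dequeueInputBuffer(10_000);
    if (inIndex >= 0) {
        ByteBuffer inBuf = codec.getInputBuffer(inIndex);
        int size = source.read(inBuf);                      // hypothetical data source
        if (size < 0) {
            codec.queueInputBuffer(inIndex, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
        } else {
            codec.queueInputBuffer(inIndex, 0, size, source.presentationTimeUs(), 0);
        }
    }

    // 2./3. Take a filled output buffer, consume it, then release it back.
    int outIndex = codec.dequeueOutputBuffer(info, 10_000);
    if (outIndex >= 0) {
        ByteBuffer outBuf = codec.getOutputBuffer(outIndex);
        sink.write(outBuf, info);                           // hypothetical consumer
        codec.releaseOutputBuffer(outIndex, false);
        done = (info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0;
    }
}

// 4. Stop and release the codec.
codec.stop();
codec.release();
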
MediaCodec Lifecycle:

Stopped (Uninitialized, Configured, Error)

Executing (Flushed, Running, End-of-Stream)

Released

  • 1. The codec is in the Uninitialized state when it is created.

    First you need to call configure(…) to bring it to the Configured state, and then call start() to move it to the Executing state. In the Executing state, the buffers described above can be used to process data.

  • 2. The Executing state is further divided into three sub-states: Flushed, Running, and End-of-Stream.

    Immediately after start() is called, the codec is in the Flushed sub-state, where it holds all of the buffers. As soon as the first input buffer is dequeued, the codec moves to the Running sub-state. When an input buffer carrying the end-of-stream flag is queued, the codec enters the End-of-Stream sub-state; in this state it no longer accepts new input buffers but still produces output buffers. At that point you can call the flush() method to reset the codec to the Flushed sub-state.

  • 3. Call stop() to return the codec to the Uninitialized state, after which it can be configured again.

    Once you have finished using the codec, you must release it by calling release().

  • 4. In rare cases, the codec may encounter an error and go to an error state.

    This is conveyed through invalid return values from queueing operations, or sometimes via exceptions. Call reset() to make the codec usable again; it can be called from any state and moves the codec back to the Uninitialized state. Otherwise, call release() to move to the terminal Released state.

MediaCodec flow control

Generally, an encoder can be given a target bit rate, but its actual output bit rate will not match the setting exactly, because what is actually controlled during encoding is not the final output bit rate but a Quantization Parameter (QP), which has no fixed relationship with bit rate and depends on the image content.

There are not many interfaces related to MediaCodec flow control: one is setting the target bit rate and the bit rate control mode during configuration, and the other is dynamically adjusting the target bit rate while encoding (API level 19 and above).

There are three bit rate control modes (a configuration sketch follows this list):

  • CQ means the bit rate is not controlled at all; image quality is preserved as far as possible.
  • CBR means the encoder tries its best to keep the output bit rate at the set value, regardless of how the content changes.
  • VBR means the encoder dynamically adjusts the output bit rate according to the complexity of the image content (in practice, the size of the inter-frame changes): complex images get a higher bit rate, simple images a lower one.

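A sketch of how the target bit rate and the control mode would be set on an encoder's MediaFormat before configure(); the resolution and bit rate values are illustrative, and the KEY_BITRATE_MODE constants require API level 21:

MediaFormat format = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, 1280, 720);
format.setInteger(MediaFormat.KEY_BIT_RATE, 2_000_000);            // target bit rate: 2 Mbps
format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);
// Bit rate control mode: BITRATE_MODE_CQ / BITRATE_MODE_CBR / BITRATE_MODE_VBR.
format.setInteger(MediaFormat.KEY_BITRATE_MODE,
        MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_VBR);

MediaCodec encoder = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_VIDEO_AVC);
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);

// Dynamically adjusting the target bit rate while encoding (API level 19 and above).
Bundle params = new Bundle();
params.putInt(MediaCodec.PARAMETER_KEY_VIDEO_BITRATE, 1_000_000);  // drop to 1 Mbps
encoder.setParameters(params);
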
Attached: API description

  • getInputBuffers: gets the array of input buffers used to submit data for encoding/decoding; returns a ByteBuffer array
  • queueInputBuffer: submits a filled input buffer to the codec's input queue
  • dequeueInputBuffer: obtains the index of an empty input buffer from the input queue so it can be filled with data
  • getOutputBuffers: gets the array of output buffers holding the codec's output; returns a ByteBuffer array
  • dequeueOutputBuffer: obtains a filled output buffer containing the encoded/decoded result
  • releaseOutputBuffer: returns an output buffer to the codec once its data has been processed

Five, audio and video recording

Audio and video recording is the process of muxing an audio track and a video track into a file container such as MP4.

MediaMuxer, the multiplexer, provides the application layer with the ability to mux encoded video and audio streams into a single audio/video file.

MediaMuxer supports at most one video track and one audio track, so if there are multiple audio tracks, mix them into a single audio track first and then use MediaMuxer to package everything into the MP4 container.

Video capture with Camera class, video preview with SurfaceView, audio capture with AudioRecord.

The working process is as follows (video capture -> encoding -> file storage); a muxing sketch follows the list:

  • 1. Collect Camera data, encode it to H.264, and store it in a file. Use SurfaceView to preview the data collected by the Camera.
  • 2. Start two threads to handle the audio and the video data separately. Logical flow: raw data is fed to MediaCodec via queueInputBuffer, and the encoded output obtained via dequeueOutputBuffer is handed to MediaMuxer.
  • 3. Add the two tracks to MediaMuxer and write samples with MediaMuxer.writeSampleData(track, data.byteBuf, data.bufferInfo);

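A minimal sketch of the MediaMuxer side of that flow; the output path is illustrative, and videoFormat/audioFormat, the encoded ByteBuffers, and the BufferInfo objects are placeholders for what the two encoder threads produce:

MediaMuxer muxer = new MediaMuxer("/sdcard/out.mp4", MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);

// The MediaFormats usually come from MediaCodec.getOutputFormat() once each encoder
// has produced its first output.
int videoTrack = muxer.addTrack(videoFormat);
int audioTrack = muxer.addTrack(audioFormat);
muxer.start();

// Each encoder thread writes its encoded buffers to its own track;
// bufferInfo is the MediaCodec.BufferInfo returned by dequeueOutputBuffer().
muxer.writeSampleData(videoTrack, encodedVideoBuffer, videoBufferInfo);
muxer.writeSampleData(audioTrack, encodedAudioBuffer, audioBufferInfo);

// Once both streams have reached end-of-stream:
muxer.stop();
muxer.release();
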
Attached: MediaMuxer API

  • MediaMuxer(String path, int format): path is the name of the output file; format is the format of the output file (currently only MP4 is supported).
  • addTrack(MediaFormat format): adds a track. The MediaFormat is usually obtained from MediaCodec.getOutputFormat() or MediaExtractor.getTrackFormat(int index); you can also build one yourself.
  • start(): starts writing the output file.
  • writeSampleData(int trackIndex, ByteBuffer byteBuf, MediaCodec.BufferInfo bufferInfo): writes the data in the ByteBuffer to the file set in the constructor.
  • stop(): stops writing the output file.
  • release(): releases resources.

Attached: H.264 encoding protocol

Six, audio and video playback

Audio and video playback is the process of separating audio track and video track and playing them separately.

MediaExtractor provides the application layer with the ability to separate (demux) audio and video data.

The working process is as follows (a minimal sketch follows the list):

  • 1. MediaExtractor extracts the media resource and selects the tracks (video, audio).
  • 2. Configure MediaCodec (width/height/duration, etc.) and associate the SurfaceView with MediaCodec.
  • 3. In a loop, take data from MediaExtractor and feed it to MediaCodec; at the same time, take the data MediaCodec returns, check the playback state, and handle it accordingly.
  • 4.1 Video playback: SurfaceView.
  • 4.2 Audio playback: AudioTrack.

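A minimal sketch of the extract-and-select step feeding a decoder; the input path is illustrative, and surfaceView is a placeholder for the preview view:

MediaExtractor extractor = new MediaExtractor();
extractor.setDataSource("/sdcard/in.mp4");

// Find and select the video track.
int videoTrackIndex = -1;
MediaFormat videoFormat = null;
for (int i = 0; i < extractor.getTrackCount(); i++) {
    MediaFormat format = extractor.getTrackFormat(i);
    String mime = format.getString(MediaFormat.KEY_MIME);
    if (mime != null && mime.startsWith("video/")) {
        videoTrackIndex = i;
        videoFormat = format;
        break;
    }
}
extractor.selectTrack(videoTrackIndex);

// Configure a decoder with the track format and the SurfaceView's Surface.
MediaCodec decoder = MediaCodec.createDecoderByType(videoFormat.getString(MediaFormat.KEY_MIME));
decoder.configure(videoFormat, surfaceView.getHolder().getSurface(), null, 0);
decoder.start();

// In the read loop, each sample is copied into one of the decoder's input buffers:
// int size = extractor.readSampleData(inputBuffer, 0);
// long pts = extractor.getSampleTime();
// extractor.advance();

extractor.release();   // after the loop finishes
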
MediaExtractor API:

  • setDataSource(String path): sets the data source; both local and network files are supported
  • getTrackCount(): gets the number of tracks in the source file
  • getTrackFormat(int index): gets the format of the specified track (index)
  • getSampleTime(): returns the current sample's timestamp
  • readSampleData(ByteBuffer byteBuf, int offset): reads sample data from the current track into the ByteBuffer at the given offset
  • advance(): advances to the next sample
  • release(): releases resources when reading is finished
