There are many ways to capture audio and video on iOS, including AVCaptureDevice, AudioQueue, and Audio Unit. Among them, Audio Unit is the lowest-level interface; its advantages are powerful functionality and low latency, and its disadvantage is a steep learning curve. For typical iOS applications, AVCaptureDevice and AudioQueue are sufficient. For live audio and video streaming, however, Audio Unit is the better choice, as it achieves the best results. The well-known WebRTC uses Audio Unit to capture and play audio. Today we will focus on the basics of Audio Unit and how to use it.

Audio Unit’s position in the iOS architecture:

Basic concepts

Audio units fall into four categories, subdivided into seven units:

- I/O: Remote I/O, Voice-Processing I/O, Generic Output
- Mixing: 3D Mixer, Multichannel Mixer
- Effect: iPod Equalizer
- Format conversion: Format Converter

The internal structure of an Audio Unit is organized into scopes and elements. There are three scopes: the input scope, the output scope, and the global scope. An element is a numbered part within the input or output scope.

The following figure shows an I/O-type Audio Unit: its input is the microphone and its output is the speaker. This is the simplest example of using an Audio Unit.


The input element is element 1 (mnemonic device: the letter “I” of the word “Input” has an appearance similar to the number 1)

The output element is element 0 (mnemonic device: the letter “O” of the word “Output” has an appearance similar to the number 0)

A summary of the workflow:

1. Describe the audio component with an AudioComponentDescription (kAudioUnitType_Output / kAudioUnitSubType_RemoteIO / kAudioUnitManufacturer_Apple).
2. Get the AudioComponent with AudioComponentFindNext(NULL, &desc). An AudioComponent is a bit like a factory that produces Audio Units.
3. Create the Audio Unit instance with AudioComponentInstanceNew(component, &audioUnit).
4. Use AudioUnitSetProperty to enable IO for recording and playback.
5. Describe the audio format with an AudioStreamBasicDescription structure and apply it with AudioUnitSetProperty.
6. Use AudioUnitSetProperty to set the recording and playback callbacks.
7. Allocate buffers.
8. Initialize the Audio Unit.
9. Start the Audio Unit.

Initialization looks like this. We have a member variable of type AudioComponentInstance that stores the Audio Unit.

The audio format below uses 16-bit integer samples.

#define kOutputBus 0
#define kInputBus 1
// ...
OSStatus status;
AudioComponentInstance audioUnit;

// Describe the audio component
AudioComponentDescription desc;
desc.componentType = kAudioUnitType_Output;
desc.componentSubType = kAudioUnitSubType_RemoteIO;
desc.componentFlags = 0;
desc.componentFlagsMask = 0;
desc.componentManufacturer = kAudioUnitManufacturer_Apple;
AudioComponent inputComponent = AudioComponentFindNext(NULL, &desc);

// Get the Audio Unit
status = AudioComponentInstanceNew(inputComponent, &audioUnit);
checkStatus(status);

// Enable IO for recording
UInt32 flag = 1;
status = AudioUnitSetProperty(audioUnit, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Input, kInputBus, &flag, sizeof(flag));
checkStatus(status);

// Enable IO for playback
status = AudioUnitSetProperty(audioUnit, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Output, kOutputBus, &flag, sizeof(flag));
checkStatus(status);

// Describe the audio format
AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = 44100.00;
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
audioFormat.mFramesPerPacket = 1;
audioFormat.mChannelsPerFrame = 1;
audioFormat.mBitsPerChannel = 16;
audioFormat.mBytesPerPacket = 2;
audioFormat.mBytesPerFrame = 2;

// Apply the format to the output scope of the input bus (recording) ...
status = AudioUnitSetProperty(audioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, kInputBus, &audioFormat, sizeof(audioFormat));
checkStatus(status);

// ... and to the input scope of the output bus (playback)
status = AudioUnitSetProperty(audioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, kOutputBus, &audioFormat, sizeof(audioFormat));
checkStatus(status);

// Set the recording (data capture) callback
AURenderCallbackStruct callbackStruct;
callbackStruct.inputProc = recordingCallback;
callbackStruct.inputProcRefCon = self;
status = AudioUnitSetProperty(audioUnit, kAudioOutputUnitProperty_SetInputCallback, kAudioUnitScope_Global, kInputBus, &callbackStruct, sizeof(callbackStruct));
checkStatus(status);

// Set the playback (render) callback. When the speaker needs data, this
// callback is invoked to fetch it -- a pull model.
callbackStruct.inputProc = playbackCallback;
callbackStruct.inputProcRefCon = self;
status = AudioUnitSetProperty(audioUnit, kAudioUnitProperty_SetRenderCallback, kAudioUnitScope_Global, kOutputBus, &callbackStruct, sizeof(callbackStruct));
checkStatus(status);

// Disable the buffer allocated for recording (we want to allocate our own)
flag = 0;
status = AudioUnitSetProperty(audioUnit, kAudioUnitProperty_ShouldAllocateBuffer, kAudioUnitScope_Output, kInputBus, &flag, sizeof(flag));
checkStatus(status);

// Initialize the Audio Unit
status = AudioUnitInitialize(audioUnit);
checkStatus(status);

Start the Audio Unit

OSStatus status = AudioOutputUnitStart(audioUnit); checkStatus(status);

Stop the Audio Unit

OSStatus status = AudioOutputUnitStop(audioUnit); checkStatus(status);

Dispose of the Audio Unit

OSStatus status = AudioUnitUninitialize(audioUnit); checkStatus(status); AudioComponentInstanceDispose(audioUnit);


The recording callback

static OSStatus recordingCallback(void *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp *inTimeStamp,
                                  UInt32 inBusNumber,
                                  UInt32 inNumberFrames,
                                  AudioBufferList *ioData) {
    // TODO: use inNumberFrames to work out how much data is valid and
    // how much space remains in the buffers.
    // Because kAudioUnitProperty_ShouldAllocateBuffer was disabled above,
    // we supply our own AudioBufferList; its buffer sizes are dynamic.
    // (In production, preallocate this buffer instead of calling malloc
    // on the real-time render thread.)
    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0].mNumberChannels = 1;
    bufferList.mBuffers[0].mDataByteSize = inNumberFrames * 2; // 16-bit mono
    bufferList.mBuffers[0].mData = malloc(bufferList.mBuffers[0].mDataByteSize);

    OSStatus status = AudioUnitRender([audioInterface audioUnit],
                                      ioActionFlags,
                                      inTimeStamp,
                                      inBusNumber,
                                      inNumberFrames,
                                      &bufferList);
    checkStatus(status);

    // The captured samples are now in the bufferList's buffers.
    DoStuffWithTheRecordedAudio(&bufferList);
    free(bufferList.mBuffers[0].mData);
    return noErr;
}

The playback callback

static OSStatus playbackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData) {
    // Note: fill as much data as possible into ioData's buffers, and
    // remember to set each buffer's mDataByteSize to match what was filled.
    return noErr;
}

Conclusion

The Audio Unit does a lot of really cool stuff, such as mixing, audio effects, recording, and so on. It sits at the bottom of the iOS audio architecture and is particularly well suited to live audio and video streaming scenarios.

“Knowledge is infinite, and take only what I need.”

iOS audio and video demo app: …