The cause

When using OWT (Open WebRTC Toolkit) to implement live streaming in our app, we found that as soon as a user joined a room and subscribed to a stream in it, the app requested the user's microphone permission. This is unfriendly to users who only want to watch the stream and have no intention of talking. The behavior we want is that the app asks for microphone access only when the user actually goes on mic, and never at any other time.

Why

Looking through the source code, we found that in the official WebRTC SDK, once an AudioTrack is added to an RTCPeerConnection, WebRTC tries to initialize both audio input and output. After the audio channel is established, WebRTC automatically handles sound capture, transmission and playback. RTCAudioSession does provide a useManualAudio property: when it is set to true, the audio input/output switch is controlled by the isAudioEnabled property. However, isAudioEnabled can only turn audio input and output on or off together. What our product needs is to keep the microphone off when merely subscribing to streams, and to use the microphone (and therefore request microphone permission) only when publishing a stream, for example when the user goes on mic.
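For context, this is roughly how an application drives those two properties today (a minimal sketch; useManualAudio and isAudioEnabled are the existing RTCAudioSession properties, the call flow around them is hypothetical):

// Application side (minimal sketch).
// Take manual control of WebRTC's audio unit.
RTCAudioSession *session = [RTCAudioSession sharedInstance];
session.useManualAudio = YES;  // audio I/O now follows isAudioEnabled
session.isAudioEnabled = NO;   // off while merely subscribing

// When the user actually goes on mic:
session.isAudioEnabled = YES;  // enables playout AND recording together

The sketch shows the limitation: the only switch available turns recording and playout on together, so a subscribe-only viewer still ends up with the microphone initialized.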

WebRTC is designed for full-duplex VoIP call applications, so it always initializes the microphone, and there is no public API to change this behavior.

The solution

At present there is no official API for this, and the corresponding low-level function is simply not implemented:

// sdk/objc/native/src/audio/audio_device_ios.mm
int32_t AudioDeviceIOS::SetMicrophoneMute(bool enable) {
  RTC_NOTREACHED() << "Not implemented";
  return -1;
}

Analyzing the source code further, we find that the Audio Unit is used inside VoiceProcessingAudioUnit. After audio data has been recorded, the OnDeliverRecordedData callback notifies AudioDeviceIOS through the VoiceProcessingAudioUnitObserver interface:

// sdk/objc/native/src/audio/voice_processing_audio_unit.mm
OSStatus VoiceProcessingAudioUnit::OnDeliverRecordedData(
    void* in_ref_con,
    AudioUnitRenderActionFlags* flags,
    const AudioTimeStamp* time_stamp,
    UInt32 bus_number,
    UInt32 num_frames,
    AudioBufferList* io_data) {
  VoiceProcessingAudioUnit* audio_unit =
      static_cast<VoiceProcessingAudioUnit*>(in_ref_con);
  return audio_unit->NotifyDeliverRecordedData(flags, time_stamp, bus_number,
                                               num_frames, io_data);
}
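NotifyDeliverRecordedData itself just forwards the call to the registered observer, which is the AudioDeviceIOS instance (a sketch of that forwarding; the exact code may differ slightly between WebRTC versions):

// sdk/objc/native/src/audio/voice_processing_audio_unit.mm (sketch)
OSStatus VoiceProcessingAudioUnit::NotifyDeliverRecordedData(
    AudioUnitRenderActionFlags* flags,
    const AudioTimeStamp* time_stamp,
    UInt32 bus_number,
    UInt32 num_frames,
    AudioBufferList* io_data) {
  // observer_ is the VoiceProcessingAudioUnitObserver implemented by
  // AudioDeviceIOS, which copies the recorded samples into its own buffers.
  return observer_->OnDeliverRecordedData(flags, time_stamp, bus_number,
                                          num_frames, io_data);
}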

Characteristics of I/O units

An I/O unit (such as the Voice Processing I/O unit used here) has two elements, and they are independent of each other. For example, you can use the Enable I/O property (kAudioOutputUnitProperty_EnableIO) to enable or disable each element independently, depending on your application's needs. Each element has an input scope and an output scope.

  • Element 1 of the I/O unit connects to the audio input hardware (the microphone). Developers can only access and control its output scope.
  • Element 0 of the I/O unit connects to the audio output hardware (the speaker). Developers can only access and control its input scope.

A handy mnemonic: the input element is element 1 (the letter “I” looks like a 1), and the output element is element 0 (the letter “O” looks like a 0).
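In other words, keeping the microphone off means writing 0 to kAudioOutputUnitProperty_EnableIO on the input scope of element 1. A minimal sketch, assuming vpio_unit_ is the Voice Processing I/O unit and kInputBus is 1, as in the WebRTC source:

// Sketch: turn the microphone path of the Voice Processing I/O unit off by
// disabling I/O on the input scope of the input element (element/bus 1).
static const AudioUnitElement kInputBus = 1;
UInt32 disable_input = 0;
OSStatus result = AudioUnitSetProperty(vpio_unit_,
                                       kAudioOutputUnitProperty_EnableIO,
                                       kAudioUnitScope_Input,
                                       kInputBus,
                                       &disable_input,
                                       sizeof(disable_input));
if (result != noErr) {
  RTCLogError(@"Failed to disable input on the input element. Error=%ld.",
              (long)result);
}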

Based on this analysis of the Audio Unit, turning the microphone off is as simple as disabling input when the Audio Unit is configured and initialized. The code below introduces an isMicrophoneMute variable, which is read from the RTCAudioSessionConfiguration settings.

Code examples:

// sdk/objc/native/src/audio/voice_processing_audio_unit.mm
bool VoiceProcessingAudioUnit::Init() {
  RTC_DCHECK_EQ(state_, kInitRequired);

  // Create an audio component description to identify the Voice Processing
  // I/O audio unit.
  AudioComponentDescription vpio_unit_description;
  vpio_unit_description.componentType = kAudioUnitType_Output;
  vpio_unit_description.componentSubType = kAudioUnitSubType_VoiceProcessingIO;
  vpio_unit_description.componentManufacturer = kAudioUnitManufacturer_Apple;
  vpio_unit_description.componentFlags = 0;
  vpio_unit_description.componentFlagsMask = 0;

  // Obtain an audio unit instance given the description.
  AudioComponent found_vpio_unit_ref =
      AudioComponentFindNext(nullptr, &vpio_unit_description);

  // Create a Voice Processing IO audio unit.
  OSStatus result = noErr;
  result = AudioComponentInstanceNew(found_vpio_unit_ref, &vpio_unit_);
  if (result != noErr) {
    vpio_unit_ = nullptr;
    RTCLogError(@"AudioComponentInstanceNew failed. Error=%ld.", (long)result);
    return false;
  }

  // Enable input on the input scope of the input element, but only when the
  // microphone is actually needed.
  RTCAudioSessionConfiguration* webRTCConfiguration =
      [RTCAudioSessionConfiguration webRTCConfiguration];
  if (webRTCConfiguration.isMicrophoneMute) {
    RTCLog(@"Enable input on the input scope of the input element.");
    UInt32 enable_input = 1;
    result = AudioUnitSetProperty(vpio_unit_, kAudioOutputUnitProperty_EnableIO,
                                  kAudioUnitScope_Input, kInputBus,
                                  &enable_input, sizeof(enable_input));
    if (result != noErr) {
      DisposeAudioUnit();
      RTCLogError(@"Failed to enable input on input scope of input element. "
                   "Error=%ld.",
                  (long)result);
      return false;
    }
  } else {
    RTCLog(@"Not enabling input on the input scope of the input element.");
  }

  // Enable output on the output scope of the output element.
  UInt32 enable_output = 1;
  result = AudioUnitSetProperty(vpio_unit_, kAudioOutputUnitProperty_EnableIO,
                                kAudioUnitScope_Output, kOutputBus,
                                &enable_output, sizeof(enable_output));
  if (result != noErr) {
    DisposeAudioUnit();
    RTCLogError(@"Failed to enable output on output scope of output element. "
                 "Error=%ld.",
                (long)result);
    return false;
  }

  // Specify the callback function that provides audio samples to the audio
  // unit.
  AURenderCallbackStruct render_callback;
  render_callback.inputProc = OnGetPlayoutData;
  render_callback.inputProcRefCon = this;
  result = AudioUnitSetProperty(
      vpio_unit_, kAudioUnitProperty_SetRenderCallback, kAudioUnitScope_Input,
      kOutputBus, &render_callback, sizeof(render_callback));
  if (result != noErr) {
    DisposeAudioUnit();
    RTCLogError(@"Failed to specify the render callback on the output bus. "
                 "Error=%ld.",
                (long)result);
    return false;
  }

  // Disable AU buffer allocation for the recorder, we allocate our own.
  // TODO(henrika): not sure that it actually saves resource to make this call.
  if (webRTCConfiguration.isMicrophoneMute) {
    RTCLog(@"Disable AU buffer allocation for the recorder, we allocate our own.");
    UInt32 flag = 0;
    result = AudioUnitSetProperty(
        vpio_unit_, kAudioUnitProperty_ShouldAllocateBuffer,
        kAudioUnitScope_Output, kInputBus, &flag, sizeof(flag));
    if (result != noErr) {
      DisposeAudioUnit();
      RTCLogError(@"Failed to disable buffer allocation on the input bus. "
                   "Error=%ld.",
                  (long)result);
      return false;
    }
  } else {
    RTCLog(@"Not disabling AU buffer allocation for the recorder.");
  }

  // Specify the callback to be called by the I/O thread to us when input audio
  // is available. The recorded samples can then be obtained by calling the
  // AudioUnitRender() method.
  if (webRTCConfiguration.isMicrophoneMute) {
    RTCLog(@"Specify the callback to be called by the I/O thread when input audio is available.");
    AURenderCallbackStruct input_callback;
    input_callback.inputProc = OnDeliverRecordedData;
    input_callback.inputProcRefCon = this;
    result = AudioUnitSetProperty(vpio_unit_,
                                  kAudioOutputUnitProperty_SetInputCallback,
                                  kAudioUnitScope_Global, kInputBus,
                                  &input_callback, sizeof(input_callback));
    if (result != noErr) {
      DisposeAudioUnit();
      RTCLogError(@"Failed to specify the input callback on the input bus. "
                   "Error=%ld.",
                  (long)result);
      return false;
    }
  } else {
    RTCLog(@"Not specifying the input callback on the input bus.");
  }

  state_ = kUninitialized;
  return true;
}
// sdk/objc/native/src/audio/voice_processing_audio_unit.mm
bool VoiceProcessingAudioUnit::Initialize(Float64 sample_rate) {
  RTC_DCHECK_GE(state_, kUninitialized);
  RTCLog(@"Initializing audio unit with sample rate: %f", sample_rate);

  OSStatus result = noErr;
  AudioStreamBasicDescription format = GetFormat(sample_rate);
  UInt32 size = sizeof(format);
#if !defined(NDEBUG)
  LogStreamDescription(format);
#endif

  RTCAudioSessionConfiguration* webRTCConfiguration =
      [RTCAudioSessionConfiguration webRTCConfiguration];
  if (webRTCConfiguration.isMicrophoneMute) {
    RTCLog(@"Setting the format on the output scope of the input element/bus because it's not movie mode");
    // Set the format on the output scope of the input element/bus.
    result = AudioUnitSetProperty(vpio_unit_, kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Output, kInputBus, &format,
                                  size);
    if (result != noErr) {
      RTCLogError(@"Failed to set format on output scope of input bus. "
                   "Error=%ld.",
                  (long)result);
      return false;
    }
  } else {
    RTCLog(@"Not setting the format on the output scope of the input element because it's movie mode");
  }

  // Set the format on the input scope of the output element/bus.
  result = AudioUnitSetProperty(vpio_unit_, kAudioUnitProperty_StreamFormat,
                                kAudioUnitScope_Input, kOutputBus, &format,
                                size);
  if (result != noErr) {
    RTCLogError(@"Failed to set format on input scope of output bus. "
                 "Error=%ld.",
                (long)result);
    return false;
  }

  // Initialize the Voice Processing I/O unit instance.
  // Calls to AudioUnitInitialize() can fail if called back-to-back on
  // different ADM instances. The error message in this case is -66635 which is
  // undocumented. Tests have shown that calling AudioUnitInitialize a second
  // time, after a short sleep, avoids this issue.
  // See webrtc:5166 for details.
  int failed_initalize_attempts = 0;
  result = AudioUnitInitialize(vpio_unit_);
  while (result != noErr) {
    RTCLogError(@"Failed to initialize the Voice Processing I/O unit. "
                 "Error=%ld.",
                (long)result);
    ++failed_initalize_attempts;
    if (failed_initalize_attempts == kMaxNumberOfAudioUnitInitializeAttempts) {
      // Max number of initialization attempts exceeded, hence abort.
      RTCLogError(@"Too many initialization attempts.");
      return false;
    }
    RTCLog(@"Pause 100ms and try audio unit initialization again...");
    [NSThread sleepForTimeInterval:0.1f];
    result = AudioUnitInitialize(vpio_unit_);
  }
  if (result == noErr) {
    RTCLog(@"Voice Processing I/O unit is now initialized.");
  }

  // AGC should be enabled by default for Voice Processing I/O units but it is
  // checked below and enabled explicitly if needed. This scheme is used
  // to be absolutely sure that the AGC is enabled since we have seen cases
  // where only zeros are recorded and a disabled AGC could be one of the
  // reasons why it happens.
  int agc_was_enabled_by_default = 0;
  UInt32 agc_is_enabled = 0;
  result = GetAGCState(vpio_unit_, &agc_is_enabled);
  if (result != noErr) {
    RTCLogError(@"Failed to get AGC state (1st attempt). "
                 "Error=%ld.",
                (long)result);
    // Example of error code: kAudioUnitErr_NoConnection (-10876).
    // All error codes related to audio units are negative and are therefore
    // converted into a positive value to match the UMA APIs.
    RTC_HISTOGRAM_COUNTS_SPARSE_100000(
        "WebRTC.Audio.GetAGCStateErrorCode1", (-1) * result);
  } else if (agc_is_enabled) {
    // Remember that the AGC was enabled by default. Will be used in UMA.
    agc_was_enabled_by_default = 1;
  } else {
    // AGC was initially disabled => try to enable it explicitly.
    UInt32 enable_agc = 1;
    result = AudioUnitSetProperty(vpio_unit_,
                                  kAUVoiceIOProperty_VoiceProcessingEnableAGC,
                                  kAudioUnitScope_Global, kInputBus,
                                  &enable_agc, sizeof(enable_agc));
    if (result != noErr) {
      RTCLogError(@"Failed to enable the built-in AGC. "
                   "Error=%ld.",
                  (long)result);
      RTC_HISTOGRAM_COUNTS_SPARSE_100000(
          "WebRTC.Audio.SetAGCStateErrorCode", (-1) * result);
    }
    result = GetAGCState(vpio_unit_, &agc_is_enabled);
    if (result != noErr) {
      RTCLogError(@"Failed to get AGC state (2nd attempt). "
                   "Error=%ld.",
                  (long)result);
      RTC_HISTOGRAM_COUNTS_SPARSE_100000(
          "WebRTC.Audio.GetAGCStateErrorCode2", (-1) * result);
    }
  }

  // Track if the built-in AGC was enabled by default (as it should) or not.
  RTC_HISTOGRAM_BOOLEAN("WebRTC.Audio.BuiltInAGCWasEnabledByDefault",
                        agc_was_enabled_by_default);
  RTCLog(@"WebRTC.Audio.BuiltInAGCWasEnabledByDefault: %d",
         agc_was_enabled_by_default);

  // As a final step, add an UMA histogram for tracking the AGC state.
  // At this stage, the AGC should be enabled, and if it is not, more work is
  // needed to find out the root cause.
  RTC_HISTOGRAM_BOOLEAN("WebRTC.Audio.BuiltInAGCIsEnabled", agc_is_enabled);
  RTCLog(@"WebRTC.Audio.BuiltInAGCIsEnabled: %u",
         static_cast<unsigned int>(agc_is_enabled));

  state_ = kInitialized;
  return true;
}

The above code uses an isMicrophoneMute variable to determine whether to enable input.
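Note that isMicrophoneMute is not part of the stock RTCAudioSessionConfiguration, so the property has to be added there first. A minimal sketch of the assumed declaration:

// sdk/objc/components/audio/RTCAudioSessionConfiguration.h (sketch)
// Flag read by VoiceProcessingAudioUnit::Init()/Initialize() above; the
// recording side of the audio unit is only set up when it is YES.
@property(nonatomic, assign) BOOL isMicrophoneMute;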

With the code above, we can decide at initialization time whether microphone permission is required. However, that alone is far from enough to support going on and off mic dynamically.

What we actually need is a way to re-initialize the Audio Unit with the desired input setting at any time. Analyzing the source code shows that we can add another property, isMicrophoneMute, to RTCAudioSession.

This variable is exposed through RTCAudioSession, just like the existing isAudioEnabled property, and we can implement it simply by mimicking isAudioEnabled.

Implement the isMicrophoneMute property in RTCAudioSession.

Code examples:

// sdk/objc/components/audio/RTCAudioSession.mm
- (void)setIsMicrophoneMute:(BOOL)isMicrophoneMute {
  @synchronized(self) {
    if (_isMicrophoneMute == isMicrophoneMute) {
      return;
    }
    _isMicrophoneMute = isMicrophoneMute;
  }
  [self notifyDidChangeMicrophoneMute];
}

- (BOOL)isMicrophoneMute {
  @synchronized(self) {
    return _isMicrophoneMute;
  }
}

- (void)notifyDidChangeMicrophoneMute {
  for (auto delegate : self.delegates) {
    SEL sel = @selector(audioSession:didChangeMicrophoneMute:);
    if ([delegate respondsToSelector:sel]) {
      [delegate audioSession:self didChangeMicrophoneMute:self.isMicrophoneMute];
    }
  }
}
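These methods assume matching declarations in the public header, placed next to isAudioEnabled; a sketch of the assumed additions:

// sdk/objc/components/audio/RTCAudioSession.h (sketch)
// New property on RTCAudioSession, alongside isAudioEnabled:
@property(nonatomic, assign) BOOL isMicrophoneMute;

// New optional method on RTCAudioSessionDelegate:
- (void)audioSession:(RTCAudioSession *)audioSession
    didChangeMicrophoneMute:(BOOL)isMicrophoneMute;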

RTCNativeAudioSessionDelegateAdapter then forwards the isMicrophoneMute change to AudioDeviceIOS.

Code examples:

// sdk/objc/components/audio/RTCNativeAudioSessionDelegateAdapter.mm
- (void)audioSession:(RTCAudioSession *)session 
    didChangeMicrophoneMute:(BOOL)isMicrophoneMute {
  _observer->OnMicrophoneMuteChange(isMicrophoneMute);
}
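For _observer->OnMicrophoneMuteChange(...) to compile, the AudioSessionObserver interface that AudioDeviceIOS implements needs a matching method; a sketch of the assumed addition:

// sdk/objc/native/src/audio_session_observer.h (sketch)
class AudioSessionObserver {
 public:
  // ... existing notifications (interruption, route change, ...) ...

  // Called when the application toggles RTCAudioSession.isMicrophoneMute.
  virtual void OnMicrophoneMuteChange(bool is_microphone_mute) = 0;

 protected:
  virtual ~AudioSessionObserver() {}
};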

The concrete logic lives in AudioDeviceIOS: AudioDeviceIOS::OnMicrophoneMuteChange posts a message to its worker thread, where the change is handled.

Code examples:

// sdk/objc/native/src/audio/audio_device_ios.mm
void AudioDeviceIOS::OnMicrophoneMuteChange(bool is_microphone_mute) {
  RTC_DCHECK(thread_);
  thread_->Post(RTC_FROM_HERE,
                this,
                kMessageTypeMicrophoneMuteChange,
                new rtc::TypedMessageData<bool>(is_microphone_mute));
}

void AudioDeviceIOS::OnMessage(rtc::Message* msg) {
  switch (msg->message_id) {
    // ...
    case kMessageTypeMicrophoneMuteChange: {
      rtc::TypedMessageData<bool>* data = static_cast<rtc::TypedMessageData<bool>*>(msg->pdata);
      HandleMicrophoneMuteChange(data->data());
      delete data;
      break;
    }
  }
}

void AudioDeviceIOS::HandleMicrophoneMuteChange(bool is_microphone_mute) {
  RTC_DCHECK_RUN_ON(&thread_checker_);
  RTCLog(@"Handling MicrophoneMute change to %d", is_microphone_mute);
  if (is_microphone_mute) {
    // Rebuild the audio path with recording enabled.
    StopPlayout();
    InitRecording();
    StartRecording();
    StartPlayout();
  } else {
    // Drop the recording side and keep playout only.
    StopRecording();
    StopPlayout();
    InitPlayout();
    StartPlayout();
  }
}
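The message id kMessageTypeMicrophoneMuteChange and the two methods above are additions to AudioDeviceIOS as well; a sketch of the assumed declarations (exactly where the constant lives is an assumption):

// sdk/objc/native/src/audio/audio_device_ios.h (sketch)
// Message id used by OnMessage(); the concrete value is arbitrary here.
enum { kMessageTypeMicrophoneMuteChange = 10 };

// Called from RTCNativeAudioSessionDelegateAdapter (any thread).
void OnMicrophoneMuteChange(bool is_microphone_mute) override;
// Runs on the internal audio thread.
void HandleMicrophoneMuteChange(bool is_microphone_mute);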
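From the application's point of view, going on and off mic is now a single property toggle (a sketch; note that, as wired up above, setting the flag to YES is what brings recording up):

// Application side (sketch).
RTCAudioSession *session = [RTCAudioSession sharedInstance];

// Viewer only subscribes to streams: no recording, no permission prompt.
session.isMicrophoneMute = NO;

// User taps "connect mic" and starts publishing:
session.isMicrophoneMute = YES;   // rebuilds the audio unit with recording on

// User leaves the mic and goes back to watch-only mode:
session.isMicrophoneMute = NO;    // recording is torn down, playout continues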

At this point, dynamic muting and unmuting of the microphone is complete.