iOS Audio and Video (3) AVFoundation playback and recording

  • As a refresher, the previous post, iOS Audio and Video (2) AVFoundation Video Capture, explained AVFoundation's camera capture capabilities with two demos (one in Objective-C, one in Swift): capturing still images, capturing the real-time video stream, recording video, and operating the flash and torch modes through the provided interfaces. Those explanations, however, were based on the interfaces in Apple's official documentation; they tell us how to call Apple's APIs to implement the related features, but not the underlying principles. Subsequent posts will cover video capture, video encoding, and video decoding in detail from the standpoint of principles. Because of time constraints it takes roughly a full day to finish each post, so progress is a little slow. This blog also draws on many other excellent blogs; those references are usually collected in Evernote and the original links are sometimes forgotten, so they will be added later.

  • This blog focuses on AVFoundation’s audio processing capabilities.

  • For audio we mainly care about two important capabilities: recording audio and playing audio. The AVFoundation framework provides classes that make these functions easy to implement, but we still need to understand the fundamentals so that we can quickly solve the problems that come up during development.

  • Before we start talking about recording and playing audio, it is necessary to learn some theoretical knowledge about audio so that we can understand it better.

  • The audio Demo of this blog can be downloaded here: AVFoundation audio Demo Swift version, AVFoundation audio playback Demo OC version

1. Theoretical knowledge of audio

1.1 Physical properties of sound

  • Sound is a wave

How does sound come about?

  1. Sound is produced by the vibration of an object.

    As shown in the picture, when the ball strikes the tuning fork, the fork vibrates and squeezes the surrounding air, producing sound. Sound is a pressure wave: when we play an instrument or slap a door or a desktop, the vibration makes the surrounding air vibrate rhythmically, changing its density and forming a longitudinal wave of alternating compressions and rarefactions that spreads outward (think of the ripples from a stone dropped into water). This is a sound wave, and the phenomenon continues until the vibration dies out.

  • The three elements of sound waves:
  1. The three elements of a sound wave are frequency, amplitude, and waveform. Frequency determines pitch, amplitude determines loudness, and waveform determines timbre.
  2. The higher the frequency (zero-crossing rate), the shorter the wavelength. Low-frequency sound has a longer wavelength, so it gets around obstacles more easily, its energy decays less, and it travels farther; the opposite holds for high-frequency sound.
  3. Loudness is a measure of energy: a table sounds different when struck with different amounts of force. In everyday life decibels are used to describe loudness, and above a certain decibel level the human ear cannot bear the sound.

Human hearing covers a frequency range of roughly 20 Hz to 20 kHz, but even within this range the ear's sensitivity varies with frequency. The industry's well-known equal-loudness curves describe the relationship between loudness, sound pressure level, and frequency: the human ear is most sensitive to sounds in the 3~4 kHz range, and sensitivity drops at lower and higher frequencies. At low sound pressure levels the frequency response of hearing is very uneven, while at higher levels it becomes more uniform. For music with a wide frequency range, a sound pressure level of 80~90 dB is ideal; levels above 90 dB can damage the human ear (about 105 dB is the ear's limit).

  • The medium of sound

A guitar sounds when the player plucks its strings and a drum when the stick strikes the drumhead; none of these sounds exist without vibration, and even our speech comes from vibrating vocal cords. If it is all vibration, why do guitars, drums, and voices sound so different? Because the medium is different. After our vocal cords vibrate, the sound resonates in the oral and cranial cavities and is then carried through the air to other people's ears; this is how our words are heard, involving both the initial sounding cavities and the transmission medium in between. Sound can in fact travel through a wide range of media: air, liquids, and solids, and the propagation speed differs with the medium. For example, sound travels at about 340 m/s in air, about 1497 m/s in distilled water, and up to about 5200 m/s in an iron rod. Sound cannot travel in a vacuum.

  • Principle of sound absorption and sound insulation
  1. Sound-absorbing material attenuates the reflected energy of the incident sound so that the original source is reproduced faithfully; for example, the walls of a recording studio are covered with sound-absorbing cotton.
  2. Sound insulation addresses the transmission of sound and reduces noise in the main space: sound-insulating material attenuates the transmitted energy of the incident sound so that the main space stays quiet, which is why the walls inside a KTV room are fitted with sound-insulating cotton.
  • Echo

We often hear an echo when we shout toward a mountain or across an open field. An echo occurs because the sound hits an obstacle and is reflected back to be heard again. However, if the two sounds reach our ears less than about 80 milliseconds apart, we cannot tell them apart. In daily life our ears are in fact constantly receiving echoes, but because the environment is noisy and the echo is lower in decibels (the unit that measures sound energy), we cannot distinguish it, or the brain receives it but cannot separate it out.

  • Resonance

Nature has light energy and water energy; everyday life has mechanical energy and electrical energy. Sound carries energy as well. For example, if one of two objects with the same natural frequency is struck, the other also starts to vibrate and sound. This phenomenon is called resonance, and it shows that a propagating sound can set another object vibrating; in other words, sound propagation is also a process of energy transfer.

1.2 Digital Audio

1.2.1 Sampling, quantization and coding

  • To digitize an analog signal, three processes are needed: sampling, quantization and coding.

  • First, the analog signal is sampled, which digitizes it along the time axis. According to the Nyquist theorem (also known as the sampling theorem), the sound must be sampled at a frequency more than twice the highest frequency present in the sound (this step is also known as A/D conversion).

  • For high-quality audio, the frequency range audible to the human ear is 20 Hz ~ 20 kHz, so the sampling rate is generally 44.1 kHz. This guarantees that components up to 20 kHz are still captured, so the sound quality heard by the ear is not degraded by digitization. A sampling rate of 44.1 kHz simply means that 44,100 samples are taken every second.

  • Now, how do we represent each sample?

  • That’s quantization.

Quantization digitizes the signal along the amplitude axis. For example, each sample of the sound can be represented with a 16-bit binary value; the range of a 16-bit value (a short) is [-32768, 32767], 65,536 possible values in total.

  • Since each quantization is a sample, how can so many samples be stored?

  • And that requires encoding. Encoding means recording the sampled and quantized digital data in a certain format, such as sequential storage or compressed storage.

  • Many formats are involved here. The raw audio data format usually referred to is Pulse Code Modulation (PCM) data. Describing PCM data requires the following concepts: sampleFormat, sampleRate, and channel count. Take CD quality as an example: the quantization format (sometimes described as bit depth) is 16 bits (2 bytes), the sampling rate is 44100, and the number of channels is 2. This information fully describes CD-quality audio.

  • In the case of sound formats, there is another concept to describe their size. It is called data bit rate, which is the number of bits per second. It is used to measure the volume of audio data per unit of time.

  • For CD sound quality data, what is the bit rate?

Calculate as follows: 44100 * 16 * 2 = 1,411,200 bit/s ≈ 1378.125 Kbit/s

  • So how much storage does this type of CD sound quality take up in a minute?
  1. The calculation is as follows: 1378.125 Kbit/s × 60 s / 8 / 1024 ≈ 10.09 MB (a short sketch of these calculations follows this list).
  2. If the sampleFormat is more precise (for example, 4 bytes per sample) or the sampleRate is denser (say 48 kHz), the data takes up more storage and can describe the sound in finer detail. The stored binary data means the analog signal has been converted into a digital signal, and the data can then be stored, played, copied, or processed in any other way.
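As a quick sanity check on the arithmetic above, here is a minimal Swift sketch; the constants are the CD-quality parameters from the text, and the variable names are only for illustration:

// CD-quality PCM parameters from the text above.
let sampleRate = 44_100.0   // samples per second
let bitDepth = 16.0         // bits per sample
let channels = 2.0          // stereo

// Bit rate in bit/s, then in Kbit/s (dividing by 1024, as the text does).
let bitsPerSecond = sampleRate * bitDepth * channels   // 1,411,200 bit/s
let kbitPerSecond = bitsPerSecond / 1024               // ≈ 1378.125 Kbit/s

// Storage for one minute of audio, converted to megabytes.
let megabytesPerMinute = kbitPerSecond * 60 / 8 / 1024 // ≈ 10.09 MB
print(kbitPerSecond, megabytesPerMinute)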
  • Some readers may be wondering: how does a microphone pick up sound?

Inside a microphone is a carbon film, very thin and very sensitive. As described above, sound is a longitudinal wave: the compressed air presses on the carbon film, which vibrates when squeezed. Beneath the film sits an electrode; as the film vibrates it touches the electrode, and the length and frequency of the contact correspond to the amplitude and frequency of the sound wave. This converts the sound signal into an electrical signal, which, after being processed by an amplifier circuit, can go through the sampling and quantization steps described above.

  • So what is a decibel?

The decibel is a unit used to express the intensity of sound. In daily life, sound pressure values span an enormous range, more than six orders of magnitude, and the ear's response to sound is not linear but roughly logarithmic, so the decibel was introduced to express acoustic quantities. A decibel value is 10 (or 20) times the base-10 logarithm of the ratio of two quantities of the same kind (for example A1 and A0), that is: N = 10 * lg(A1 / A0). The decibel symbol is "dB" and it is dimensionless; A0 is the reference quantity and A1 is the measured quantity.
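As a minimal Swift illustration of this formula (the function name is only for this sketch; 10 is used for power ratios and 20 for amplitude ratios such as sound pressure):

import Foundation

// Decibels from the ratio of two like quantities: N = factor * lg(A1 / A0).
// Use factor 20 when A1/A0 is an amplitude (e.g. sound pressure) ratio.
func decibels(_ measured: Double, reference: Double, factor: Double = 10) -> Double {
    return factor * log10(measured / reference)
}

print(decibels(100, reference: 1))   // a power ratio of 100 is 20 dB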

1.2.2 Audio coding

  • As mentioned above, CD-quality data takes up roughly 10.1 MB per minute. That may be acceptable if it is only stored on a device (a CD or a hard disk), but for real-time transmission over a network the amount of data is too large, so it must be compressed.
  • One of the basic measures of compression coding is the compression ratio, which is usually less than 1 (otherwise there would be no point in compressing, since the goal is to reduce the data size).
  • Compression algorithms include lossy compression and lossless compression.
  1. Lossless compression means that the decompressed data can be completely restored. Lossy compression is the most commonly used kind of compression in audio.
  2. Lossy compression means that the decompressed data cannot be completely restored and some information is lost. The smaller the compression ratio, the more information is lost and the greater the distortion after the signal is restored. Different compression algorithms, such as PCM, WAV, AAC, MP3, and Ogg, can be chosen for different application scenarios (storage devices, transmission networks, and playback devices).
  • Principle of compression coding

The principle of compression coding is, in essence, to remove redundant signals. Redundant signals are those the human ear cannot perceive: audio outside the ear's hearing range and audio that is masked. Signals outside the hearing range were covered above, so they are not repeated here. Masked signals arise from the masking effect of the human ear, which shows up both in the frequency domain and in the time domain; whether in the time domain or the frequency domain, masked audio is treated as redundant information and is not encoded.

  • What are the formats of compression coding?

There are: WAV coding, MP3 coding, AAC coding, Ogg coding.

  • WAV encoding:
  1. PCM stands for Pulse Code Modulation; its general workflow was described earlier. One implementation of WAV encoding (there are several, none of which compress the data) simply adds a 44-byte header in front of the PCM data that describes the PCM sampling rate, number of channels, data format, and other information.
  2. Features: very good sound quality, supported by a large amount of software.
  3. Application: intermediate files in multimedia development; saving music and sound material.
  • MP3 encoding:
  1. MP3 has a good compression ratio. A high-bit-rate MP3 file produced with the LAME encoder (one implementation of the MP3 format) sounds very close to the source WAV file; of course, the parameters should be tuned for different application scenarios to get the best results.
  2. Features: good sound quality at 128 Kbit/s and above, high compression ratio, supported by a large amount of software and hardware, good compatibility.
  3. Application: music listening at relatively high bit rates where compatibility is required.
  • AAC encoding:
  1. AAC is a new generation of lossy audio compression technology. Through additional coding techniques (such as PS and SBR) it has derived three main encoding formats: LC-AAC, HE-AAC, and HE-AAC v2. LC-AAC is the traditional AAC, mainly used at medium-to-high bit rates (≥ 80 Kbit/s). HE-AAC (equivalent to AAC + SBR) is mainly used at medium-to-low bit rates (≤ 80 Kbit/s). The newer HE-AAC v2 (equivalent to AAC + SBR + PS) is mainly used at low bit rates (≤ 48 Kbit/s). In practice most encoders automatically enable PS at ≤ 48 Kbit/s and omit it above 48 Kbit/s, which is then equivalent to plain HE-AAC.
  2. Features: excellent performance at bit rates below 128 Kbit/s; mostly used for the audio in video.
  3. Application: audio encoding below 128 Kbit/s, mostly for the audio tracks of video.
  • Ogg encoding:
  1. Ogg is a very promising codec that performs well at all bit rates, especially in low and medium bit rate scenarios. Besides its sound quality, Ogg is completely free, which lays the groundwork for broader support. Ogg uses excellent algorithms that achieve better sound quality at smaller bit rates: a 128 Kbit/s Ogg file can sound better than an MP3 at 192 Kbit/s or more. At present, however, there is no media server software that supports it, so Ogg-based digital broadcasting is not feasible, and Ogg support, in both software and hardware, is still not comparable to MP3.
  2. Features: achieves better sound quality than MP3 at smaller bit rates, performs well at high, medium, and low bit rates; compatibility is poor and streaming is not supported.
  3. Application scenario: audio messages in voice chat.

1.3 Audio codec

  • The AVFoundation framework can work with any audio codec supported by the Core Audio framework, which means AVFoundation can handle a large number of resources in different formats. However, when not using linear PCM audio, in most cases only AAC is used.

  • AAC is the audio counterpart of the H.264 video standard and has become the most mainstream encoding for streamed and downloaded audio. The format is a significant improvement over MP3 and delivers higher-quality audio at lower bit rates, making it ideal for publishing and distributing audio on the Web. In addition, AAC is free of the certificate and license restrictions for which the MP3 format has been criticized.

  • Both the AVFoundation and Core Audio frameworks provide support for decoding MP3 data, but not for encoding it.

  • The AVFoundation framework started out as an audio-only framework: its predecessor, introduced in iOS 2.2, contained only a class for audio playback, and in iOS 3.0 Apple added audio recording. Although these classes are by far the oldest in the framework, they are still among the most commonly used.

  • With the audio theory covered, we will start with AVAudioPlayer and AVAudioRecorder, the two basic audio classes of AVFoundation.

2. Play the audio

2.1 AVAudioPlayer profile

  • There are many classes that can play audio in iOS; here we mainly explain how to play audio with the AVAudioPlayer class.
  • Here's a brief introduction to the AVAudioPlayer class.
  • AVAudioPlayer has been available since iOS 2.2. It is an audio player that plays audio data from a file or from memory. Use this class for audio playback unless you are playing audio captured from a network stream or require very low I/O latency.
  • AVAudioPlayer inherits from NSObject
  • AVAudioPlayer provides the following features:
  1. Play sounds of any duration
  2. Play sounds from a file or a memory buffer
  3. Loop playback
  4. Play multiple sounds at the same time (one sound per audio player) with precise synchronization
  5. Control relative playback level, stereo positioning, and playback rate
  6. Seek to a specific point in a sound file, which supports features such as fast-forward and rewind
  7. Obtain data that can be used for playback level metering
  • The AVAudioPlayer class lets you play sound in any audio format available in iOS and macOS. You can implement a delegate to handle interruptions (such as an incoming phone call on iOS) and to update the user interface when a sound finishes playing. The delegate methods are described in AVAudioPlayerDelegate. This class uses Objective-C declared properties for managing information about a sound, such as the playback point within the sound's timeline, and for accessing playback options such as volume and looping.
  • To configure an appropriate audio session for playback on iOS, see AVAudioSession and AVAudioSessionDelegate.
  • AVAudioPlayerDelegate mainly provides two delegate callback methods:

(1) When the audio playback is complete, the following callback method will be called:

optional func audioPlayerDidFinishPlaying(_ player: AVAudioPlayer, 
                             successfully flag: Bool)

(2) The following callback method will be called when the audio player encounters a decoding error during playback:

optional func audioPlayerDecodeErrorDidOccur(_ player: AVAudioPlayer, 
                                       error: Error?)
  • To play, pause, or stop an audio player, one of its playback control methods can be called using these interfaces:


// Play sounds asynchronously.
func play() -> Bool

// Plays sounds asynchronously, starting from a specified point in the audio output device timeline.
func play(atTime: TimeInterval) -> Bool


// Pause playback; The sound is ready to resume playing from where it left off.
func pause()

// Stop playback and undo the Settings required for playback.
func stop()

// Prepare the audio player for playback by preloading the buffer.
func prepareToPlay() -> Bool

// Fade into a new volume at a specific duration.
func setVolume(Float, fadeDuration: TimeInterval)

// A Boolean value indicating whether the audio player is playing (true) or not (false).
var isPlaying: Bool

// The audio player's playback volume, linear range from 0.0 to 1.0.
var volume: Float

// Stereo panning position of audio player.
var pan: Float

// Audio player playback rate.
var rate: Float

// A Boolean value that specifies whether to enable playback rate adjustment for the audio player.
var enableRate: Bool

// The number of times a sound returns to the beginning, and when it reaches the end, it is repeated.
var numberOfLoops: Int

// Audio player delegate object.
var delegate: AVAudioPlayerDelegate?

// A protocol that allows a delegate to respond to audio interrupts and audio decoding errors and complete playback of the sound.
protocol AVAudioPlayerDelegate

// Audio player settings dictionary, containing player-related sound information.
var settings: [String : Any]
  • In addition, AVAudioPlayer provides an interface for managing information about sounds:

// The number of audio channels in the sound associated with the audio player.
var numberOfChannels: Int

// An array of AVAudioSessionChannelDescription objects associated with the audio player.
var channelAssignments: [AVAudioSessionChannelDescription]?

// The total duration in seconds of the sounds associated with the audio player.
var duration: TimeInterval

// Player point, in seconds, in the timeline of the sound associated with the audio player.
var currentTime: TimeInterval

// Time value of the audio output device, in seconds.
var deviceCurrentTime: TimeInterval

// The URL of the sound associated with the audio player.
var url: URL?

// The data object that contains the sounds associated with the audio player.
var data: Data?

// UID of the current audio player.
var currentDevice: String?

// The format of the audio in the buffer.
var format: AVAudioFormat
  • In addition, AVAudioPlayer provides an interface for measuring audio levels:

// A Boolean value that specifies the audio level measurement on/off status of the audio player.
var isMeteringEnabled: Bool

// Returns the average power in decibels for a given channel.
func averagePower(forChannel: Int) -> Float

// Returns the peak power of a given channel, expressed in decibels as the sound being played.
func peakPower(forChannel: Int) -> Float

// Returns the average and peak power values for refreshing all channels of the audio player.
func updateMeters(a)
  • The following settings keys apply to all audio formats handled by the AVAudioPlayer and AVAudioRecorder classes:

// Format identifier.
let AVFormatIDKey: String

// The sampling rate, in hertz, expressed as an NSNumber floating-point value (for example 8000 or 16000).
let AVSampleRateKey: String

// The number of channels, expressed as an NSNumber.
let AVNumberOfChannelsKey: String

2.2 AVAudioPlayer implements audio playback

  • AVAudioPlayer is built on the top layer of C-based Audio Queue Services of Core Audio. Its limitations lie in that it cannot play Audio from network streams, access original Audio samples, and meet very low latency.

2.2.1 create AVAudioPlayer

  • You can create an AVAudioPlayer either through NSData or NSURL of a local audio file.
    NSURL *fileUrl = [[NSBundle mainBundle] URLForResource: @"rock" withExtension:@"mp3"];
    self.player = [[AVAudioPlayer alloc] initWithContentsOfURL:fileUrl error:nil];
    if (self.player) {
        [self.player prepareToPlay];
    }

After creating an AVAudioPlayer it is recommended to call the prepareToPlay method. It acquires the required audio hardware and preloads the Audio Queue's buffers. If it is not called explicitly, it is called implicitly when play runs, but that causes a slight delay before playback starts.
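For reference, a minimal Swift sketch of the same setup (the file name "rock.mp3" is just an example resource in the bundle; error handling is simplified):

import AVFoundation

// Create a player from a bundled file and preload its buffers.
var player: AVAudioPlayer?
if let fileURL = Bundle.main.url(forResource: "rock", withExtension: "mp3") {
    player = try? AVAudioPlayer(contentsOf: fileURL)
    player?.prepareToPlay()   // acquire hardware and fill the Audio Queue buffers up front
}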

2.2.2 Control playback

Calling play on an AVAudioPlayer starts playback, and both stop and pause halt it, but stop also undoes the setup performed by prepareToPlay. The AVAudioPlayer properties described above control how playback behaves; the specific settings are as follows:

  1. Modify player volume: Player volume is independent of system volume, and the volume or playback gain is defined as a floating point value between 0.0 (mute) and 1.0 (maximum volume)
  2. Change the player’s PAN value: Allows sound to be played in stereo, from -1.0 (extreme left) to 1.0 (extreme right), default is 0.0 (center)
  3. Adjust playback rate: 0.5 (half speed) to 2.0 (2x speed)
  4. Set numberOfLoops to create seamless loops: -1 means an infinite loop (looped audio should be uncompressed linear PCM or a compressed format such as AAC; looping MP3 is not recommended)
  5. Audio metering: read the average and peak power levels from the player while playback occurs

2.2.3 Playing/stopping audio

  • Playback of audio
        NSTimeInterval delayTime = [self.players[0] deviceCurrentTime] + 0.01;
        for (AVAudioPlayer *player in self.players) {
            [player playAtTime:delayTime];
        }
        self.playing = YES;

To play multiple sounds in sync, capture the current device time and add a small delay so that all players share a reference start time. deviceCurrentTime is a time value of the audio output device that is independent of system events: it increases monotonically while one or more audio players are playing or paused, and it is reset to 0 when no player is playing or paused. The time passed to playAtTime must be based on deviceCurrentTime and must be greater than or equal to it.
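A minimal Swift version of the same idea (players is assumed to be an array of already-prepared AVAudioPlayer instances):

import AVFoundation

// Start several prepared players at the same reference time.
func startSynchronized(_ players: [AVAudioPlayer]) {
    guard let first = players.first else { return }
    let startTime = first.deviceCurrentTime + 0.01   // small shared delay
    for player in players {
        player.play(atTime: startTime)
    }
}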

  • Stop playing audio
        for (AVAudioPlayer *player in self.players) {
            [player stop];
            player.currentTime = 0.0f;
        }

When stopping, set the player's currentTime to 0.0. While audio is playing, this value identifies the offset of the current playback position; when audio is not playing, it is the offset at which playback will start when resumed.

  • Calling the play method starts playback immediately, the pause method pauses it, and, as you would expect, the stop method stops it. Interestingly, from the outside pause and stop appear to do the same thing: they both halt the current playback, and the next call to play resumes audio that was halted by either pause or stop.
  • The real difference between the two lies in the underlying handling: calling the stop method undoes the setup performed by prepareToPlay, while the pause method does not.

2.2.4 Modifying the Volume and Playback rate

  • We can simply set the properties of the AVAudioPlayer object to change the volume, playback rate, whether to loop, etc., as follows:
player.enableRate = YES;
player.rate = rate;
player.volume = volume;
player.pan = pan;
player.numberOfLoops = -1;
  • Modify the player's volume: the player's volume is independent of the system volume, and many interesting effects, such as fading out, can be built on top of it. The volume, or playback gain, is a floating-point value between 0.0 (silence) and 1.0 (maximum volume).
  • Modify the player's pan value: this allows the sound to be positioned in stereo; it is a floating-point value ranging from -1.0 (extreme left) to 1.0 (extreme right), with a default of 0.0 (centered).
  • Adjust the playback rate: iOS 5 added a powerful feature that lets users change the playback rate without altering the pitch, from 0.5 (half speed) to 2.0 (double speed). Slowing down helps when studying a complex piece of music or speech; speeding up helps when skimming a long meeting recording.
  • Use the numberOfLoops property to loop audio seamlessly: a number n greater than 0 makes the audio player loop n times, while assigning -1 makes the player loop indefinitely.

2.2.5 Configuring An Audio Session

  • Since audio sessions are common to all applications, they are typically set up at program startup through AVAudioSession singletons.

  • If you want audio to keep playing when the user flips the Ring/Silent switch, set the session category to AVAudioSessionCategoryPlayback. If playback should also continue after the screen is locked, add the Required background modes array to the Info.plist and include "App plays audio or streams audio/video using AirPlay" (a minimal configuration sketch follows this list).

  • Configuring an audio session will be explained in detail later in the recording section.
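Even so, a minimal Swift sketch of the playback configuration just described looks roughly like this (error handling is reduced to printing; the background-mode key still has to be added to Info.plist separately):

import AVFoundation

// Configure the shared audio session so playback continues with the Ring/Silent switch set to silent.
let session = AVAudioSession.sharedInstance()
do {
    try session.setCategory(.playback, mode: .default, options: [])
    try session.setActive(true)
} catch {
    print("Audio session configuration failed: \(error.localizedDescription)")
}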

2.2.6 Handling Interrupt Events

  • Interruption events include incoming phone calls, alarms, FaceTime calls, and so on. When an interruption occurs, the system calls the following methods on the AVAudioPlayer's AVAudioPlayerDelegate:
- (void)audioPlayerBeginInterruption:(AVAudioPlayer *)player NS_DEPRECATED_IOS(2_2, 8_0);
- (void)audioPlayerEndInterruption:(AVAudioPlayer *)player withOptions:(NSUInteger)flags NS_DEPRECATED_IOS(6_0, 8_0);

The method called when the interruption ends carries an options parameter; if it is AVAudioSessionInterruptionOptionShouldResume, audio playback can be resumed.

To be prepared before an interruption takes effect, first subscribe to the interruption notification by registering for the AVAudioSessionInterruptionNotification posted by the application's AVAudioSession.

override init() {
    super.init()
    let nc = NotificationCenter.default

    nc.addObserver(self, selector: #selector(handleInterruption(_:)), name: AVAudioSession.interruptionNotification, object: AVAudioSession.sharedInstance())
    nc.addObserver(self, selector: #selector(handleRouteChange(_:)), name: AVAudioSession.routeChangeNotification, object: AVAudioSession.sharedInstance())
}

The delivered notification contains a userInfo dictionary with important information that can be used to decide which action to take, as the following code shows:

@objc func handleInterruption(_ notification: Notification) {
    if let info = notification.userInfo {
        // The userInfo values are NSNumbers, so go through the raw values.
        let typeValue = info[AVAudioSessionInterruptionTypeKey] as! UInt
        let type = AVAudioSession.InterruptionType(rawValue: typeValue)
        if type == .began {
            stop()
            delegate?.playbackStopped()
        } else {
            let optionsValue = info[AVAudioSessionInterruptionOptionKey] as! UInt
            let options = AVAudioSession.InterruptionOptions(rawValue: optionsValue)
            if options.contains(.shouldResume) {
                play()
                delegate?.playbackBegan()
            }
        }
    }
}

In the handleInterruption method we first determine the interruption type from the AVAudioSessionInterruptionTypeKey value. When the interruption begins we call the stop method and notify the delegate of the interruption by calling its playbackStopped method. It is important to note that by the time the notification is received the audio session has already been deactivated and the AVAudioPlayer instance is paused; calling the stop method here only updates internal state, it does not stop playback itself.

2.2.7 Handling Route Changes

  • Audio input and output routes on an iOS device can change for several reasons, for example when the user plugs in or unplugs headphones or attaches a USB microphone. When such events occur, the audio system switches to a different input or output route, and AVFoundation broadcasts a notification describing the change to all interested listeners.
  • As a best practice, the playback action should not change when the headset is plugged in, but should be paused when the headset is unplugged.
  • Route changes can only be handled through system notifications.
  • First you need to listen for notifications with the following code:
        NSNotificationCenter *nsnc = [NSNotificationCenter defaultCenter];
        [nsnc addObserver:self
                 selector:@selector(handleRouteChange:)
                     name:AVAudioSessionRouteChangeNotification
                   object:[AVAudioSession sharedInstance]];
  • Then check whether the reason is that the old device became unavailable, retrieve the description of the previous route, determine whether the old output was headphones, and if so stop playback. The code is as follows:
- (void)handleRouteChange:(NSNotification *)notification {

    NSDictionary *info = notification.userInfo;

    AVAudioSessionRouteChangeReason reason =
        [info[AVAudioSessionRouteChangeReasonKey] unsignedIntValue];

    if (reason == AVAudioSessionRouteChangeReasonOldDeviceUnavailable) {

        AVAudioSessionRouteDescription *previousRoute =
            info[AVAudioSessionRouteChangePreviousRouteKey];

        AVAudioSessionPortDescription *previousOutput = previousRoute.outputs[0];
        NSString *portType = previousOutput.portType;

        if ([portType isEqualToString:AVAudioSessionPortHeadphones]) {
            [self stop];
            [self.delegate playbackStopped];
        }
    }
}

The first thing to do after receiving the notification is to determine why the route change occurred, by reading the AVAudioSessionRouteChangeReasonKey value from the userInfo dictionary. The value is an unsigned integer that indicates the reason for the change, and different events can be inferred from it, such as a new device being attached. The event we especially care about here is headphones being unplugged, whose corresponding reason is AVAudioSessionRouteChangeReasonOldDeviceUnavailable.

Once we know a device was disconnected, we query the userInfo dictionary for the AVAudioSessionRouteDescription that describes the previous route. The route description bundles an array of inputs and an array of outputs; here we take the first output port of the previous route and check whether it is a headphone port. If it is, we stop playback and call the delegate's playbackStopped method.

Note that AVAudioSessionPortHeadphones covers only wired headphones; for wireless Bluetooth headphones you also need to check for the AVAudioSessionPortBluetoothA2DP value.
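A minimal Swift sketch of the same check, extended to Bluetooth as just described; it uses the modern AVAudioSession.Port constants, and stop() and the delegate call stand in for whatever your player class provides:

@objc func handleRouteChange(_ notification: Notification) {
    guard let info = notification.userInfo,
          let reasonValue = info[AVAudioSessionRouteChangeReasonKey] as? UInt,
          let reason = AVAudioSession.RouteChangeReason(rawValue: reasonValue),
          reason == .oldDeviceUnavailable,
          let previousRoute = info[AVAudioSessionRouteChangePreviousRouteKey] as? AVAudioSessionRouteDescription,
          let previousOutput = previousRoute.outputs.first else { return }

    // Stop when wired headphones or a Bluetooth A2DP device was removed.
    if previousOutput.portType == .headphones || previousOutput.portType == .bluetoothA2DP {
        stop()
        delegate?.playbackStopped()
    }
}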

2.2.8 Audio Playback processing

2.2.8.1 Playing local Audio

  • We can use AVAudioPlayer to play local music, whereas playing remote audio requires AVPlayer (covered in the next section).
#import "ViewController.h"
#import <AVFoundation/AVFoundation.h>

@interface ViewController ()
@property (nonatomic,strong)AVAudioPlayer *player;
@end

@implementation ViewController

- (AVAudioPlayer *)player{
    if (_player == nil) {
        //1. Local music resource
        NSURL *url = [[NSBundle mainBundle]URLForResource:@"235319.mp3" withExtension:nil];
        //2. Create an AVAudioPlayer object
        _player = [[AVAudioPlayer alloc]initWithContentsOfURL:url error:nil];
        //3. Prepare to play (buffering to improve playback fluency)
        [_player prepareToPlay];
    }
    return _player;
}
// Play (asynchronous play)
- (IBAction)play {
    [self.player play];
}
// Pause the music and start again
- (IBAction)pause {
    [self.player pause];
}
// Stop the music, stop and start again
- (IBAction)stop {
    [self.player stop];
    // This should be empty
    self.player = nil;
}  
@end
  • For short local sound effects ("short audio" here usually means clips of about 1 to 2 seconds), we can instead call AudioServicesCreateSystemSoundID(url, &_soundID) directly, which has the lowest cost.
#import "ViewController.h"
#import <AVFoundation/AVFoundation.h>

@interface ViewController ()
@property (nonatomic,assign)SystemSoundID soundID;
@end

@implementation ViewController

- (SystemSoundID)soundID{
    if (_soundID == 0) {
        // Generate the sound ID
        CFURLRef url = (__bridge CFURLRef)[[NSBundle mainBundle]URLForResource:@"buyao.wav" withExtension:nil];
        AudioServicesCreateSystemSoundID(url, &_soundID);
    }
    return _soundID;
}

- (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event{
    // Play the sound effect (no vibration)
    AudioServicesPlaySystemSound(self.soundID);
    // AudioServicesPlayAlertSound(<#SystemSoundID inSystemSoundID#>); // plays the sound with vibration
}

@end

2.2.8.2 Playing remote Audio

  • AVPlayer allows you to play both local and remote (over the network) music

  • The OC code for playing the audio stream is as follows:

@interface ViewController ()
@property (nonatomic,strong)AVPlayer *player;
@end

@implementation ViewController

-(void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event{
    // Play music
    [self.player play];
}

#pragma mark -
- (AVPlayer *)player{
    if (_player == nil) {
            
    // To play remote music, just change the URL to network music
    //NSURL *url = [NSURL URLWithString:@"http://cc.stream.qqmusic.qq.com/C100003j8IiV1X8Oaw.m4a?fromtag=52"];

    //1. Local music resources
    NSURL *url = [[NSBundle mainBundle]URLForResource: @"235319.mp3" withExtension:nil];

    //2. The URL set by this method cannot be dynamically switched
    _player = [AVPlayer playerWithURL:url];

    //2.0 Create a playerItem; you can switch tracks by changing the playerItem
    //AVPlayerItem *playerItem = [AVPlayerItem playerItemWithURL:url];
    //2.1 This method can change the URL dynamically
    //_player = [AVPlayer playerWithPlayerItem:playerItem];
    
    //AVPlayerItem *nextItem = [AVPlayerItem playerItemWithURL:nil];
    // Use the replaceCurrentItemWithPlayerItem: method to replace the URL and switch tracks
    //[self.player replaceCurrentItemWithPlayerItem:nextItem];
    
    }
    return _player;
}
@end
  • The Swift code for playing the audio stream is as follows:
// Initialize the audio playback and return the audio duration
// Player related
var playerItem:AVPlayerItem!
var audioPlayer:AVPlayer!

// Audio url
var audioUrl: String = "" {
    didSet {
        self.setupPlayerItem()
    }
}

func initPlay() {
    // Initialize the player
    audioPlayer = AVPlayer()
    // Listen for the end of audio playback
    NotificationCenter.default.addObserver(self, selector: #selector(playItemDidReachEnd), name: NSNotification.Name.AVPlayerItemDidPlayToEndTime, object: AudioRecordManager.shared().playerItem)
}

// Set the resource
private func setupPlayerItem() {
    guard let url = URL(string: audioUrl) else {
        return
    }
    self.playerItem = AVPlayerItem(url: url)
    self.audioPlayer.replaceCurrentItem(with: playerItem)
}

// Get the audio duration
func getDuration() -> Float64 {
    if AudioRecordManager.shared().playerItem == nil {
        return 0.0
    }
    let duration: CMTime = playerItem!.asset.duration
    let seconds: Float64 = CMTimeGetSeconds(duration)
    return seconds
}

func getCurrentTime() -> Float64 {
    if AudioRecordManager.shared().playerItem == nil {
        return 0.0
    }
    let duration: CMTime = playerItem!.currentTime()
    let seconds: Float64 = CMTimeGetSeconds(duration)
    return seconds
}

// End of playback
var audioPlayEndBlock: (() -> ())?
func playItemDidReachEnd(notifacation: NSNotification) {
    audioPlayer?.seek(to: kCMTimeZero)
    if let block = audioPlayEndBlock {
        block()
    }
}

// Play
func playAudio() {
    if audioPlayer != nil {
        audioPlayer?.play()
    }
}

// Pause
var audioStopBlock: (() -> ())?
func stopAudio() {
    if audioPlayer != nil {
        audioPlayer?.pause()
        if let block = audioStopBlock {
            block()
        }
    }
}

// Destroy
func destroyPlayer() {
    if AudioRecordManager.shared().playerItem != nil {
        AudioRecordManager.shared().audioPlayer?.pause()
        AudioRecordManager.shared().playerItem?.cancelPendingSeeks()
        AudioRecordManager.shared().playerItem?.asset.cancelLoading()
    }
}

3. Record audio

3.1 AVAudioRecorder profile

  • AVAudioRecorder is a class provided by the AVFoundation framework for recording audio inside an application. It also inherits directly from NSObject and has been available since iOS 3.0.
class AVAudioRecorder : NSObject
  • AVAudioRecorder provides the following features:
  1. Record until the user stops the recording
  2. Record for a specified duration
  3. Pause and resume a recording
  4. Obtain input audio-level data that can be used for level metering
  • In iOS, the recorded audio comes from the device's built-in microphone or from a connected headset microphone. In macOS, the audio comes from the system's default audio input device, as set by the user in System Preferences.

  • You can implement a delegate object for an audio recorder to respond to audio interruptions and audio decoding errors, and to act when a recording completes.

  • To configure a recording, including options such as bit depth, bit rate, and sample-rate conversion quality, configure the recorder's settings dictionary using the keys described in the audio settings documentation.

var settings: [String : Any] { get }
  • The recorder setting is only effective after the prepareToRecord() method is explicitly called or implicitly called by starting recording. Audio Settings keys are described in Audio Settings and Formats.
  • In addition, AVAudioRecorder provides the following properties:

// A Boolean value indicating whether the recorder is recording.
var isRecording: Bool

// The URL of the audio file associated with the recorder.
var url: URL

// An array of AVAudioSessionChannelDescription objects associated with the recorder.
var channelAssignments: [AVAudioSessionChannelDescription]?

// Time, in seconds, since the recording started.
var currentTime: TimeInterval

// The time (in seconds) of the host device where the audio recorder is located.
var deviceCurrentTime: TimeInterval

// The format of the audio in the buffer.
var format: AVAudioFormat

3.2 AVAudioSession profile

Audio sessions act as a middleman between the application and the operating system. They give the OS a simple, semantic description of how an application intends to use audio, without requiring knowledge of the details of interacting with the audio hardware. This lets you specify the general audio behavior of your application and delegate the management of that behavior to the audio session, so the OS can manage the user's audio experience in the best way.

  • All iOS applications have an audio session, whether they use it or not. The default audio session is preconfigured as follows:
  1. Audio playback is enabled, but audio recording is not.
  2. When the user switches the Ring/Silent switch to silent, all audio played by the application is silenced.
  3. When the user locks the device, audio playing in the background is silenced.
  4. When the application plays audio, other background audio is silenced.
  • To enable recording, you need to configure the audio session: the system uses the Solo Ambient category by default, and recording requires the Play and Record category.
  • To configure an appropriate recording session, see AVAudioSession and AVAudioSessionDelegate.
  • To implement recording, it is necessary to understand the AVAudioSession class, which also inherits NSObject directly
class AVAudioSession : NSObject
  • The audio session acts as an intermediary between the application and the operating system and, in turn, between the underlying audio hardware. You use an audio session to communicate the general nature of application audio to the operating system without specifying specific behavior or the interaction required with the audio hardware. You delegate the management of these details to the audio session to ensure that the operating system can best manage the user’s audio experience.
  • All iOS, tvOS, and watchOS apps have a default audio session, pre-configured with the following behavior:
  1. It supports audio playback, but does not allow audio recording (tvOS does not support audio recording).
  2. In iOS, setting the ringer/Mute switch to silent mode will mute any audio played by the app.
  3. On iOS, locking the device silences the app’s audio.
  4. When the app plays the audio, it will mute any other background audio.

3.2.1 Audio Session Mode

  • While the default audio session provides useful behavior, it generally does not provide the audio behavior required by media applications. To change the default behavior, you need to configure the audio session category for your application.
  • You can use seven possible categories (see the audio session categories and modes), but Playback is the most common one for playback applications. This category indicates that audio playback is a core feature of your application. When you specify it, your app's audio continues when the Ring/Silent switch is set to silent (iOS only). With this category you can also play audio in the background if you enable the Audio, AirPlay, and Picture in Picture background mode. For more information, see Enabling Background Audio.
  • The 7 categories of audio session behavior are as follows:
| Category | Silenced by Ring/Silent switch or screen lock | Interrupts audio of non-mixing apps | Allows audio input (recording) and output (playback) | Typical use |
| --- | --- | --- | --- | --- |
| AVAudioSessionCategoryAmbient | Yes | No | Output only | Games, productivity apps |
| AVAudioSessionCategorySoloAmbient (default) | Yes | Yes | Output only | Games, productivity apps |
| AVAudioSessionCategoryPlayback | No | Yes by default; no by using the override switch | Output only | Audio and video players |
| AVAudioSessionCategoryRecord | No (recording continues after the screen locks) | Yes | Input only | Recorders, audio capture |
| AVAudioSessionCategoryPlayAndRecord | No | Yes by default; no by using the override switch | Input and output | VoIP, voice chat |
| AVAudioSessionCategoryMultiRoute | No | Yes | Input and output | Advanced A/V apps using external hardware |

Note: for your application to continue playing audio when the Ring/Silent switch is set to silent and the screen is locked, make sure the audio value has been added to the UIBackgroundModes key in your application's Info.plist. This is required in addition to using the correct category.

  • Modes and their compatible categories:

| Mode identifier | Compatible categories | Purpose |
| --- | --- | --- |
| AVAudioSessionModeDefault | All | The default audio session mode |
| AVAudioSessionModeMoviePlayback | AVAudioSessionCategoryPlayback | Specify this mode if your app is playing movie content |
| AVAudioSessionModeVideoRecording | AVAudioSessionCategoryPlayAndRecord, AVAudioSessionCategoryRecord | Select this mode if the app is recording a movie |
| AVAudioSessionModeVoiceChat | AVAudioSessionCategoryPlayAndRecord | Select this mode if the app performs two-way voice communication, such as VoIP |
| AVAudioSessionModeGameChat | AVAudioSessionCategoryPlayAndRecord | Provided by Game Kit for apps that use Game Kit's voice chat service |
| AVAudioSessionModeVideoChat | AVAudioSessionCategoryPlayAndRecord | Specify this mode if the app is conducting online video conferencing |
| AVAudioSessionModeSpokenAudio | AVAudioSessionCategoryPlayback | Select this mode for continuous spoken audio that should pause when another app plays a short audio prompt |
| AVAudioSessionModeMeasurement | AVAudioSessionCategoryPlayAndRecord, AVAudioSessionCategoryRecord, AVAudioSessionCategoryPlayback | Specify this mode if the app is performing audio input or output measurement |

3.2.2 Configuring an Audio Session
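The Objective-C configuration is shown in the recording section below; as a minimal Swift sketch of configuring a session for recording (the category and mode follow the tables above, and the record-permission request is included because recording needs microphone access):

import AVFoundation

// Configure the shared session for recording and playback, then activate it.
let session = AVAudioSession.sharedInstance()
do {
    try session.setCategory(.playAndRecord, mode: .default, options: [.defaultToSpeaker])
    try session.setActive(true)
} catch {
    print("Audio session configuration failed: \(error.localizedDescription)")
}

// Ask the user for microphone permission before recording.
session.requestRecordPermission { granted in
    print(granted ? "Microphone access granted" : "Microphone access denied")
}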

3.3 AVAudioRecorder recording function

3.3.1 Recording Function Details

3.3.1.1 Configuring the Audio Session Mode for Recording

  • Above we explained in detail the application scenarios of each session category; an app that both records and plays back should configure its session with the AVAudioSessionCategoryPlayAndRecord category.
    AVAudioSession *session = [AVAudioSession sharedInstance];

    NSError *error;
    if (![session setCategory:AVAudioSessionCategoryPlayAndRecord error:&error]) {
        NSLog(@"Category Error: %@", [error localizedDescription]);
    }

    if (![session setActive:YES error:&error]) {
        NSLog(@"Activation Error: %@", [error localizedDescription]);
    }

3.3.1.2 Setting General Parameters for Recording

  • Audio formats
  1. The AVFormatIDKey corresponds to the audio format of the content to be written, with the following possible values: kAudioFormatLinearPCM, kAudioFormatMPEG4AAC, kAudioFormatAppleLossless, kAudioFormatAppleIMA4, kAudioFormatiLBC, kAudioFormatULaw.
  2. kAudioFormatLinearPCM writes the uncompressed audio stream and produces large files. The compressed formats kAudioFormatMPEG4AAC and kAudioFormatAppleIMA4 shrink files significantly while still guaranteeing high-quality audio. Note, however, that the chosen audio format must be compatible with the file type; for example the WAV format corresponds to the kAudioFormatLinearPCM value.
  • Sampling rate

AVSampleRateKey indicates the sampling rate, i.e. the number of samples taken per second from the analog input signal. Common values are 8000, 16000, 22050, and 44100. The sampling rate plays a critical role in the quality of the recorded audio and in the size of the final file: a low rate such as 8 kHz produces a coarse, AM-radio-like recording but a small file, while a rate of 44.1 kHz (CD-quality) gives very high quality audio at the cost of a larger file. There is no single definition of the best sampling rate, but developers should stick to standard rates such as 8000, 16000, 22050, and 44100; ultimately, our ears are the judge.

  • The channel number

AVNumberOfChannelsKey defines the number of channels of the recorded audio: a value of 1 means mono recording (the default) and 2 means stereo recording. Mono is usually chosen (that is, AVNumberOfChannelsKey = 1) unless external hardware is used for recording.

  • Encoding bit depth

AVEncoderBitDepthHintKey indicates the encoding bit depth, from 8 to 32.

  • Audio quality

AVEncoderAudioQualityKey indicates audio quality, and the optional values are: AVAudioQualityMin, AVAudioQualityLow, AVAudioQualityMedium, AVAudioQualityHigh, AVAudioQualityMax.
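Putting these keys together, a minimal Swift sketch of a recorder settings dictionary (the specific values are just examples, and the values are wrapped as numbers, matching the caution in the Swift demo later in this post):

import AVFoundation

// Example recorder settings combining the keys described above.
let recorderSettings: [String: Any] = [
    AVFormatIDKey: Int(kAudioFormatAppleIMA4),                 // audio format
    AVSampleRateKey: 44_100.0,                                 // sampling rate in Hz
    AVNumberOfChannelsKey: 1,                                  // mono
    AVEncoderBitDepthHintKey: 16,                              // encoding bit depth
    AVEncoderAudioQualityKey: AVAudioQuality.medium.rawValue   // audio quality
]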

3.3.1.3 Initializing the AVAudioRecorder Object

  • The following information is required to create AVAudioRecorder:
  1. The URL of the local file used to write audio
  2. A dictionary used to configure key values for recording sessions
  3. NSError used to catch errors
  • At initialization, the prepareToRecord method is called to perform the necessary initialization of the underlying Audio Queue and create the file at the specified location.
  • The initialization code is as follows:
        NSString *tmpDir = NSTemporaryDirectory();
        NSString *filePath = [tmpDir stringByAppendingPathComponent:@"memo.caf"];
        NSURL *fileURL = [NSURL fileURLWithPath:filePath];

        NSDictionary *settings = @{
                                   AVFormatIDKey : @(kAudioFormatAppleIMA4),
                                   AVSampleRateKey : @44100.0f,
                                   AVNumberOfChannelsKey : @1,
                                   AVEncoderBitDepthHintKey : @16,
                                   AVEncoderAudioQualityKey : @(AVAudioQualityMedium)
                                   };

        NSError *error;
        self.recorder = [[AVAudioRecorder alloc] initWithURL:fileURL settings:settings error:&error];
        if (self.recorder) {
            self.recorder.delegate = self;
            self.recorder.meteringEnabled = YES;
            [self.recorder prepareToRecord];
        } else {
            NSLog(@"Error: %@", [error localizedDescription]);
        }
  • The code above records to a file named memo.caf in the tmp directory. The .caf (Core Audio Format) format is usually the best container format for recorded audio, because it is content-agnostic and can hold any audio format supported by Core Audio.

  • In addition, the recording settings specify Apple IMA4 as the audio format, a 44.1 kHz sampling rate, a 16-bit depth, and mono recording. These settings strike a balance between quality and file size.

3.3.1.4 Saving Recording Files
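As a minimal, hypothetical sketch of reacting to a finished recording and moving the temporary file somewhere permanent (the destination file name is only an example; the delegate callback is the real AVAudioRecorderDelegate method):

import AVFoundation

// AVAudioRecorderDelegate callback fired when recording stops.
func audioRecorderDidFinishRecording(_ recorder: AVAudioRecorder, successfully flag: Bool) {
    guard flag else { return }
    let documents = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
    let destination = documents.appendingPathComponent("memo.caf")   // example name
    do {
        // Copy the temporary recording into the Documents directory.
        try FileManager.default.copyItem(at: recorder.url, to: destination)
    } catch {
        print("Saving recording failed: \(error.localizedDescription)")
    }
}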

3.3.2 Complete recording code

  • OC recording code is as follows:
@interface ViewController ()
@property (nonatomic,strong) AVAudioRecorder *recorder;
@end

@implementation ViewController
 // Lazy loading
- (AVAudioRecorder *)recorder{
      if (_recorder == nil) {
          //1. Create sandbox paths
          NSString *path = [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory.NSUserDomainMask.YES) lastObject];
          //2. Splice audio files
          NSString *filePath = [path stringByAppendingPathComponent:@"123.caf"];
          //3. Convert to the url file://
          NSURL *url = [NSURL fileURLWithPath:filePath];
          //4. Set recording parameters
          NSDictionary *settings = @{
                                     /** Recording quality:
                                      typedef NS_ENUM(NSInteger, AVAudioQuality) {
                                          AVAudioQualityMin    = 0,
                                          AVAudioQualityLow    = 0x20,
                                          AVAudioQualityMedium = 0x40,
                                          AVAudioQualityHigh   = 0x60,
                                          AVAudioQualityMax    = 0x7F
                                      }; */
                                     AVEncoderAudioQualityKey : [NSNumber numberWithInteger:AVAudioQualityLow],
                                     AVEncoderBitRateKey : [NSNumber numberWithInteger:16],
                                     AVSampleRateKey : [NSNumber numberWithFloat:8000],
                                     AVNumberOfChannelsKey : [NSNumber numberWithInteger:2]
                                     };
          NSLog(@"%@", url);
          // The first argument is the URL where the recording will be saved
          // The second argument is the recording settings
          // The third argument is an error output parameter
          self.recorder = [[AVAudioRecorder alloc]initWithURL:url settings:settings error:nil];
      }
      return _recorder;
  }
  // Start recording
  - (IBAction)start:(id)sender {
      [self.recorder record];
  }
  // Stop recording
  - (IBAction)stop:(id)sender {
      [self.recorder stop];
  }
@end
  • Swift version recording code is as follows:
var recorder: AVAudioRecorder?
var player: AVAudioPlayer?
let file_path = PATH_OF_CACHE.appending("/record.wav")
var mp3file_path = PATH_OF_CACHE.appending("/audio.mp3")

private static var _sharedInstance: AudioRecordManager?
// Privatize the init method
private override init() {}

/// Singleton
///
/// - Returns: the singleton object
class func shared() -> AudioRecordManager {
    guard let instance = _sharedInstance else {
        _sharedInstance = AudioRecordManager()
        return _sharedInstance!
    }
    return instance
}

/// Destroy the singleton
class func destroy() {
    _sharedInstance = nil
}

// Start recording
func beginRecord() {
    let session = AVAudioSession.sharedInstance()
    // Set the session category
    do {
        try session.setCategory(AVAudioSessionCategoryPlayAndRecord)
    } catch let err {
        Dprint("Setting category failed: \(err.localizedDescription)")
    }
    // Activate the session
    do {
        try session.setActive(true)
    } catch let err {
        Dprint("Activating the session failed: \(err.localizedDescription)")
    }
    // Recording settings. Note that the values need to be NSNumbers; otherwise no audio file is produced,
    // presumably because the underlying implementation is still written in Objective-C.
    let recordSetting: [String: Any] = [
        AVSampleRateKey: NSNumber(value: 44100.0),                              // sampling rate
        AVFormatIDKey: NSNumber(value: kAudioFormatLinearPCM),                  // audio format
        AVLinearPCMBitDepthKey: NSNumber(value: 16),                            // bit depth
        AVNumberOfChannelsKey: NSNumber(value: 2),                              // number of channels
        AVEncoderAudioQualityKey: NSNumber(value: AVAudioQuality.min.rawValue)  // recording quality
    ]
    // Start recording
    do {
        let url = URL(fileURLWithPath: file_path)
        recorder = try AVAudioRecorder(url: url, settings: recordSetting)
        recorder!.prepareToRecord()
        recorder!.record()
        Dprint("Start recording.")
    } catch let err {
        Dprint("Recording failed: \(err.localizedDescription)")
    }
}

var stopRecordBlock: ((_ audioPath: String, _ audioFormat: String) -> ())?

// End the recording
func stopRecord() {
    let session = AVAudioSession.sharedInstance()
    // Set the session category
    do {
        try session.setCategory(AVAudioSessionCategoryPlayback)
    } catch let err {
        Dprint("Setting category failed: \(err.localizedDescription)")
    }
    // Activate the session
    do {
        try session.setActive(true)
    } catch let err {
        Dprint("Activating the session failed: \(err.localizedDescription)")
    }
    if let recorder = self.recorder {
        if recorder.isRecording {
            Dprint("Recording; ending it now, file saved to: \(file_path)")
            let manager = FileManager.default
            if manager.fileExists(atPath: mp3file_path) {
                do {
                    try manager.removeItem(atPath: mp3file_path)
                } catch let err {
                    Dprint(err)
                }
            }
            AudioWrapper.audioPCMtoMP3(file_path, andPath: mp3file_path)
            Dprint("Recording; ending it now, file saved to: \(mp3file_path)")
            if let block = stopRecordBlock {
                block("/audio.mp3", "mp3")
            }
        } else {
            Dprint("Not recording, but finishing anyway.")
        }
        recorder.stop()
        self.recorder = nil
    } else {
        Dprint("Not initialized")
    }
}

// Cancel recording
func cancelRecord() {
    if let recorder = self.recorder {
        if recorder.isRecording {
            recorder.stop()
            self.recorder = nil
        }
    }
}

/// Initialize local playback
func initLocalPlay() {
    do {
        Dprint(mp3file_path)
        player = try AVAudioPlayer(contentsOf: URL(fileURLWithPath: mp3file_path))
        player?.delegate = self
        Dprint("Song length: \(player!.duration)")
    } catch let err {
        Dprint("Playback failed: \(err.localizedDescription)")
    }
}

// Play the local audio file
func play() {
    player?.play()
}

// Pause the local audio
func stop() {
    player?.pause()
}

var localPlayFinishBlock: (() -> ())?
func audioPlayerDidFinishPlaying(_ player: AVAudioPlayer, successfully flag: Bool) {
    if let block = AudioRecordManager.shared().localPlayFinishBlock {
        block()
    }
}

// Progress bar
func progress() -> Double {
    return (player?.currentTime)! / (player?.duration)!
}

4. Visual audio signal

  • The most powerful and useful feature in AVAudioRecorder and AVAudioPlayer is the measurement of audio. Audio Metering allows developers to read the average decibel and peak decibel of Audio, and use this data to visually present the volume of sound to end users.
  • AVAudioRecorder and AVAudioPlayer both have two methods to obtain the average decibel and peak decibel data of the current audio, as follows:
- (float)averagePowerForChannel:(NSUInteger)channelNumber; /* returns average power in decibels for a given channel */
- (float)peakPowerForChannel:(NSUInteger)channelNumber; /* returns peak power in decibels for a given channel */
  • Return values range from -160dB (mute) to 0dB (maximum decibel)
  • To support audio metering, set meteringEnabled to YES when initializing the player or recorder, before reading the values.
  • The first step is to convert the decibel value from -160 to 0 to the range 0 to 1, as follows:
@implementation THMeterTable {
    float _scaleFactor;
    NSMutableArray *_meterTable;
}

- (id)init {
    self = [super init];
    if (self) {
        float dbResolution = MIN_DB / (TABLE_SIZE - 1);

        _meterTable = [NSMutableArray arrayWithCapacity:TABLE_SIZE];
        _scaleFactor = 1.0f / dbResolution;

        float minAmp = dbToAmp(MIN_DB);
        float ampRange = 1.0 - minAmp;
        float invAmpRange = 1.0 / ampRange;

        for (int i = 0; i < TABLE_SIZE; i++) {
            float decibels = i * dbResolution;
            float amp = dbToAmp(decibels);
            float adjAmp = (amp - minAmp) * invAmpRange;
            _meterTable[i] = @(adjAmp);
        }
    }
    return self;
}

float dbToAmp(float dB) {
    return powf(10.0f, 0.05f * dB);
}

- (float)valueForPower:(float)power {
    if (power < MIN_DB) {
        return 0.0f;
    } else if (power >= 0.0f) {
        return 1.0f;
    } else {
        int index = (int) (power * _scaleFactor);
        return [_meterTable[index] floatValue];
    }
}

@end

The code above builds an internal array that holds precomputed conversions from decibel values at a given resolution. The resolution used here is -0.2 dB, and it can be adjusted by changing the MIN_DB and TABLE_SIZE values.

Each dB value is converted by the dbToAmp function to a linear value in the range 0 (for -60 dB) to 1; the smooth curve of values over this range is then squared and stored in the internal lookup table. When needed, these values are later retrieved by calling the valueForPower method.

  • The average and peak decibel values can be obtained in real time:
- (THLevelPair *)levels {
    [self.recorder updateMeters];
    float avgPower = [self.recorder averagePowerForChannel:0];
    float peakPower = [self.recorder peakPowerForChannel:0];
    float linearLevel = [self.meterTable valueForPower:avgPower];
    float linearPeak = [self.meterTable valueForPower:peakPower];
    return [THLevelPair levelsWithLevel:linearLevel peakLevel:linearPeak];
}

The code above first calls the recorder's updateMeters method, which must be called immediately before reading the level values to ensure the readings are up to date. The average and peak power are then requested for channel 0; channels are zero-indexed, and since we record in mono we only need the first channel. Finally, the linear levels are looked up in the meter table and a new THLevelPair instance is created.

  • Reading the audio level values is similar to reading the current time: the recorder must be polled whenever the latest value is needed. We could use NSTimer, but since the displayed measurements are updated frequently to keep the animation smooth, CADisplayLink is recommended for driving the updates.
  • CADisplayLink is similar to NSTimer, but it is automatically synchronized with the display's refresh rate (a short sketch follows below).
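A minimal Swift sketch of driving level updates with CADisplayLink; the meter-view update is left as a placeholder, while updateMeters, averagePower(forChannel:), and peakPower(forChannel:) are the real AVAudioRecorder APIs:

import AVFoundation
import QuartzCore

final class LevelMonitor {
    private var displayLink: CADisplayLink?
    private let recorder: AVAudioRecorder   // assumed to have isMeteringEnabled = true

    init(recorder: AVAudioRecorder) {
        self.recorder = recorder
    }

    func start() {
        displayLink = CADisplayLink(target: self, selector: #selector(tick))
        displayLink?.add(to: .main, forMode: .common)
    }

    func stop() {
        displayLink?.invalidate()
        displayLink = nil
    }

    @objc private func tick() {
        recorder.updateMeters()                             // refresh the readings first
        let average = recorder.averagePower(forChannel: 0)  // -160 dB ... 0 dB
        let peak = recorder.peakPower(forChannel: 0)
        // Feed `average` and `peak` into the meter view here.
        _ = (average, peak)
    }
}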

5. Exception handling
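As a minimal sketch of handling the error callbacks from the two delegate protocols used in this post, these methods can be added to whatever class serves as the player's and recorder's delegate (what to do on an error depends on the app; logging is just a placeholder):

// Called when the player hits a decoding error during playback (AVAudioPlayerDelegate).
func audioPlayerDecodeErrorDidOccur(_ player: AVAudioPlayer, error: Error?) {
    print("Playback decode error: \(error?.localizedDescription ?? "unknown")")
}

// Called when the recorder hits an encoding error while recording (AVAudioRecorderDelegate).
func audioRecorderEncodeErrorDidOccur(_ recorder: AVAudioRecorder, error: Error?) {
    print("Recording encode error: \(error?.localizedDescription ?? "unknown")")
    recorder.stop()
}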

Reference books: “AV Foundation Development Secrets”, “Audio and Video Development Advanced Guide based on Android and iOS platform practice”