This is the 21st day of my participation in the August More Text Challenge

Artificial Intelligence Audio Processing Library — Librosa (Installation and Use)

Preface

Install Librosa

pypi

conda

source

Common functions of Librosa

Core audio processing functions

Audio processing

Spectral representation

Amplitude conversion

Time-frequency conversion

Feature extraction

Plotting and display

Common function code implementation

Read the audio

Feature extraction

Extracting log-Mel spectrogram features

Extract MFCC features

Plotting and display

Draw the sound waveform

Draw the spectrogram


Preface

Librosa is a Python toolkit for audio and music analysis and processing. This article mainly introduces the installation and use of Librosa.


Install Librosa

The Librosa website provides a variety of installation methods, as detailed below:

pypi

The simplest way is to install with pip, which also installs all required dependencies:

pip install librosa


conda

If Anaconda is installed, run the conda command to install it:

conda install -c conda-forge librosa


source

Download the source code in advance (github.com/librosa/lib…), then unpack and install it:

tar xzf librosa-VERSION.tar.gz
cd librosa-VERSION/
python setup.py install
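Whichever installation method is used, it can help to confirm that the package is visible to the Python interpreter you plan to run. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and is not part of Librosa.

```python
import importlib.util
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    # find_spec returns None when the top-level module cannot be located
    if importlib.util.find_spec(package) is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The module is importable but has no installed distribution metadata
        return None


print("librosa:", installed_version("librosa"))
```

If this prints None, the install most likely went into a different environment than the one running the script.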

Common functions of Librosa

Core audio processing functions

This part introduces the most commonly used audio processing functions, including the audio reading function load(), the resampling function resample(), the short-time Fourier transform stft(), the amplitude conversion function amplitude_to_db() and the frequency conversion function hz_to_mel(). For details, please refer to librosa.github.io/librosa/core.html

Audio processing

Spectral representation

Amplitude conversion
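As a rough illustration of what an amplitude-to-decibel conversion computes, the sketch below implements the usual formulas in plain Python. The function names are hypothetical, and the default values for amin and top_db are assumptions modeled on the defaults of amplitude_to_db() and power_to_db(); this is a simplified stand-in, not Librosa's implementation.

```python
import math


def amplitude_to_db_sketch(amplitudes, ref=1.0, amin=1e-5, top_db=80.0):
    """Convert linear amplitudes to decibels: 20 * log10(amplitude / ref)."""
    # amin guards against log10(0); it is an assumed floor value
    db = [20.0 * math.log10(max(amin, abs(a)) / max(amin, abs(ref)))
          for a in amplitudes]
    floor = max(db) - top_db  # keep at most top_db of dynamic range
    return [max(d, floor) for d in db]


def power_to_db_sketch(powers, ref=1.0, amin=1e-10, top_db=80.0):
    """Convert power values (amplitude squared) to decibels: 10 * log10(power / ref)."""
    db = [10.0 * math.log10(max(amin, p) / max(amin, abs(ref)))
          for p in powers]
    floor = max(db) - top_db
    return [max(d, floor) for d in db]


amps = [1.0, 0.5, 0.1, 0.0]
print(amplitude_to_db_sketch(amps))
print(power_to_db_sketch([a * a for a in amps]))
```

Both calls describe the same signal, once as amplitude and once as power, so the two printed lists should agree up to rounding.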

Time-frequency conversion
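The conversions in this group are simple closed-form mappings. The sketch below shows two of them in plain Python: the HTK-style Hz-to-Mel formula and the frame-index-to-seconds mapping. The function names are hypothetical. Note that hz_to_mel() in Librosa uses a different (Slaney-style) Mel scale unless htk=True is passed, so the numbers here correspond to the htk=True variant only.

```python
import math


def hz_to_mel_htk(freq_hz):
    """HTK-style Mel scale: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz_htk(mel):
    """Inverse of hz_to_mel_htk."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frames_to_time_sketch(frame_index, sr, hop_length=512):
    """Start time in seconds of a frame: frame_index * hop_length / sr."""
    return frame_index * hop_length / sr


for f in (0.0, 440.0, 1000.0, 8000.0):
    m = hz_to_mel_htk(f)
    print(f, round(m, 2), round(mel_to_hz_htk(m), 2))

print(frames_to_time_sketch(100, sr=16000, hop_length=512))
```

The Mel scale is roughly linear at low frequencies and logarithmic at high frequencies, which is why 1000 Hz maps to about 1000 Mel while 8000 Hz maps to far less than 8000.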

Feature extraction

This part lists some commonly used spectral feature extraction methods, including Mel spectrogram, MFCC, CQT, etc. For function details, refer to http://librosa.github.io/librosa/feature.html

Plotting and display

This part contains the commonly used spectrogram display function specshow() and the waveform display function waveplot(). For details, please refer to librosa.github.io/librosa/dis….html


Common function code implementation

1. Read the audio

# Import the library
import librosa

# Load a wav file; by default librosa resamples it to 22050 Hz
y, sr = librosa.load('./sample.wav')
print(y)
print(sr)

# To keep the file's original sample rate, set sr=None
y, sr = librosa.load('./sample.wav', sr=None)
# The original sample rate of the example file is 16000
print(sr)

# To resample while loading, set sr to the rate you want
y, sr = librosa.load('./sample.wav', sr=18000)
print(sr)
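Resampling changes the number of samples but not the duration of the clip. The arithmetic can be sketched in plain Python as follows; the helper name and the 3-second clip are hypothetical, and rounding up with ceil is an assumption about how the output length is chosen.

```python
import math


def resampled_length(n_samples, orig_sr, target_sr):
    """Number of samples after resampling, keeping the duration fixed."""
    return math.ceil(n_samples * target_sr / orig_sr)


n = 3 * 16000  # hypothetical 3-second clip recorded at 16 kHz
for target in (16000, 18000, 22050):
    m = resampled_length(n, 16000, target)
    # m / target is the duration in seconds and stays at 3.0
    print(target, m, m / target)
```

This is why len(y) differs between the three load() calls above even though they read the same file.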


2. Feature extraction

Extracting log-Mel spectrogram features

The log-Mel spectrogram is a feature commonly used in speech recognition and environmental sound recognition. Because CNNs are so effective at image processing, spectrogram features of audio signals are now widely used, even more so than MFCC. In Librosa, extracting log-Mel spectrogram features takes only a few lines of code:

import librosa

# Load a wav file at its original sample rate
y, sr = librosa.load('./sample.wav', sr=None)
# Extract the Mel spectrogram (recent librosa versions require keyword arguments here)
melspec = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024, hop_length=512, n_mels=128)
# Convert the power spectrogram to a log (decibel) scale
logmelspec = librosa.power_to_db(melspec)
print(logmelspec.shape)

The printed shape shows that the log-Mel spectrogram feature is a two-dimensional array: 128 is the Mel frequency dimension (frequency domain) and 100 is the number of time frames (time domain), so the log-Mel spectrogram is a time-frequency representation of the audio signal. Here n_fft is the window size, in this case 1024; hop_length is the distance between adjacent windows, here 512, which gives 50% overlap between adjacent windows; n_mels is the number of Mel bands, set to 128.
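The relationship between these parameters and the output shape can be checked with a little arithmetic. The sketch below assumes the centered framing that Librosa uses by default, where the frame count is 1 + n_samples // hop_length; the helper names and the clip length of 51000 samples are hypothetical values chosen for illustration.

```python
def num_frames(n_samples, hop_length):
    """Frame count for centered (padded) framing."""
    return 1 + n_samples // hop_length


def overlap_fraction(n_fft, hop_length):
    """Fraction of each window shared with the next one."""
    return 1.0 - hop_length / n_fft


n_fft, hop_length, n_mels = 1024, 512, 128
sr = 16000
n_samples = 51000  # hypothetical clip, roughly 3.2 s at 16 kHz

# Expected feature shape: (Mel bands, time frames)
shape = (n_mels, num_frames(n_samples, hop_length))
print(shape)
print(overlap_fraction(n_fft, hop_length))  # fraction of overlap between windows
print(hop_length / sr)                      # seconds between adjacent frames
```

With these values each frame advances by 32 ms, so the time axis of the feature has a much coarser resolution than the raw waveform.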


3. Extract MFCC features

MFCC features are widely used in automatic speech recognition and speaker recognition. Readers interested in the details of MFCC can refer to the blog post http://blog.csdn.net/zzc15806/article/details/79246716. In Librosa, only one function is required to extract MFCC features:

# Extract MFCC features (y and sr come from the librosa.load call above)
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40)
print(mfccs)
print(mfccs.shape)

MFCC is not described further here.

Librosa also has many other audio feature extraction methods, such as CQT features, Chroma features, etc., which are covered in the second part, "Common functions of Librosa".


4. Plotting and display

4.1 Draw the sound waveform

Librosa provides the waveform display function waveplot():

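import librosa.display
import matplotlib.pyplot as plt

# Only needed inside Jupyter/IPython (equivalent to %matplotlib inline)
get_ipython().run_line_magic('matplotlib', 'inline')

plt.figure()
# waveplot() is deprecated and missing from recent librosa releases;
# there, use librosa.display.waveshow(y, sr=sr) instead
librosa.display.waveplot(y, sr=sr)
plt.title('sample waveform')
plt.show()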


4.2 Draw the spectrogram

Librosa provides the spectrogram display function specshow():

# Compute the log-Mel spectrogram and display it
melspec = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024, hop_length=512, n_mels=128)
logmelspec = librosa.power_to_db(melspec)
plt.figure()
librosa.display.specshow(logmelspec, sr=sr, x_axis='time', y_axis='mel')
plt.title('Mel spectrogram')
plt.show()


Plot the sound waveform and the spectrogram in one figure:

# Extract the Mel spectrogram and convert it to a log scale
melspec = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024, hop_length=512, n_mels=128)
logmelspec = librosa.power_to_db(melspec)
plt.figure()
# Top panel: waveform
plt.subplot(2, 1, 1)
librosa.display.waveplot(y, sr=sr)
plt.title('sample waveform')
# Bottom panel: Mel spectrogram
plt.subplot(2, 1, 2)
librosa.display.specshow(logmelspec, sr=sr, x_axis='time', y_axis='mel')
plt.title('Mel spectrogram')
plt.tight_layout()  # Ensure the subplots do not overlap
plt.show()

This concludes the introduction to installing and using Librosa. Librosa offers far more than the functions shown here; please refer to the official Librosa website for more usage:

librosa.github.io/librosa/ind…

End of text!!