Speech denoising with spectral subtraction in MATLAB

I. Introduction

Spectral subtraction is one of the most commonly used methods for speech denoising, and one of the earliest and most mature. It relies on the noise being additive and uncorrelated with the speech. Under the assumption that the noise is statistically stationary, the noise spectrum measured in the speech-free gaps is used in place of the noise spectrum during speech. Subtracting this estimate from the spectrum of the noisy speech gives an estimate of the speech spectrum. Spectral subtraction is widely used because the algorithm is simple and computationally cheap. It is easy to implement for fast processing and can achieve a high output signal-to-noise ratio. The shortcoming of the classical form of the algorithm is that processing leaves behind “musical noise” with a noticeable rhythmic fluctuation.
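The idea of measuring the noise spectrum in speech-free gaps can be sketched as follows. This is a minimal illustration, not the method used in the source code later in this post: the synthetic signal, the sampling rate, and the assumption that the first `numNoiseFrames` frames contain no speech are all chosen here for the example.

```matlab
% Sketch: estimate the noise power spectrum from leading speech-free frames.
% All parameters below are illustrative assumptions.
winsize = 256;                        % frame length
numNoiseFrames = 10;                  % assumed number of leading noise-only frames
fs = 8000;                            % assumed sampling rate
t = (0:fs-1)/fs;
% Synthetic stand-in for speech: silence followed by a 440 Hz tone
speech = [zeros(1, numNoiseFrames*winsize), sin(2*pi*440*t)];
y = speech + 0.23*randn(1, numel(speech));   % add white noise
% Hamming window written out so that no toolbox is required
w = 0.54 - 0.46*cos(2*pi*(0:winsize-1)/(winsize-1));
Pn = zeros(1, winsize);
for k = 1:numNoiseFrames
    frame = y((k-1)*winsize+1 : k*winsize);  % k-th speech-free frame
    Pn = Pn + abs(fft(frame.*w)).^2;         % accumulate its power spectrum
end
Pn = Pn/numNoiseFrames;               % averaged noise power spectrum estimate
```

Averaging over several frames reduces the variance of the estimate, which matters because an inaccurate noise spectrum is one of the causes of musical noise.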

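Setting the negative values of the subtracted spectrum to zero (half-wave rectification) leaves small, isolated peaks at random frequencies in each frame's spectrum. When converted back to the time domain, these peaks sound like multiple tones whose frequencies change randomly from frame to frame. This is particularly pronounced in silent segments. This “noise” caused by half-wave rectification is called “musical noise”. Fundamentally, the main causes of musical noise are: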

(1) Nonlinear handling of the negative values produced by the subtraction

(2) Inaccurate estimation of the noise spectrum

(3) High variability of the suppression function (gain function)
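The effect of the first two causes can be seen on a frame that contains only noise. The sketch below is a hedged illustration with assumed parameters (frame length, noise level, number of averaging frames): even with a well-averaged noise estimate, a sizeable fraction of bins in a new noise frame exceed the estimate and survive the rectification as isolated peaks.

```matlab
% Sketch: isolated residual peaks after half-wave rectification.
% Frame length, noise level and frame count are illustrative assumptions.
N = 256;                                   % frame length
sigma = 0.23;                              % noise standard deviation
numFrames = 200;                           % frames averaged for the noise estimate
frames = sigma*randn(numFrames, N);        % noise-only training frames
Pn = mean(abs(fft(frames, [], 2)).^2, 1);  % averaged noise power spectrum
d = sigma*randn(1, N);                     % a new noise-only frame
Pd = abs(fft(d)).^2;                       % its (highly variable) power spectrum
Ps = max(Pd - Pn, 0);                      % subtraction with half-wave rectification
survivors = nnz(Ps > 0);                   % bins that were not set to zero
fprintf('%d of %d bins survive half-wave rectification\n', survivors, N);
```

The surviving bins change from frame to frame, which is what produces the tonal, fluctuating character of the residual.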

1. Principle
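In commonly used notation, the relations behind spectral subtraction can be written as below. The symbols are chosen here for illustration and do not come from the original post: $y(n)$ is the noisy speech, $s(n)$ the clean speech, $d(n)$ the additive noise, capital letters denote the DFT of one windowed frame at bin $k$, and a hat marks an estimate.

```latex
\begin{align}
y(n) &= s(n) + d(n) \\
Y_k &= S_k + D_k \\
|Y_k|^2 &= |S_k|^2 + |D_k|^2 + 2\,\mathrm{Re}\{S_k D_k^{*}\} \\
|\hat{S}_k|^2 &= \max\bigl(|Y_k|^2 - |\hat{D}_k|^2,\; 0\bigr) \\
\hat{S}_k &= |\hat{S}_k|\, e^{j \angle Y_k} \\
|\hat{S}_k|^{a} &= \max\bigl(|Y_k|^{a} - b\,|\hat{D}_k|^{a},\; 0\bigr)
\end{align}
```

Because speech and noise are assumed uncorrelated, the cross term in the third line averages to zero and is dropped. The fourth line is the basic power subtraction with half-wave rectification, the fifth reuses the phase of the noisy speech, and the last line is the generalized form with exponent $a$ and over-subtraction factor $b$ that corresponds to the `a` and `b` parameters in the source code.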

2. Flow chart

Disadvantages of spectral subtraction

1) Because negative values are half-wave rectified, small, isolated peaks appear at random frequencies in each frame's spectrum and carry over into the time domain. These peaks sound like multiple tones whose frequencies vary randomly from frame to frame, commonly known as “musical noise”.

2) Spectral subtraction also has a smaller disadvantage: it uses the phase of the noisy speech as the phase of the enhanced speech. The resulting speech can therefore sound rough, especially at low signal-to-noise ratios, where the phase error can become audible and reduce speech quality.
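The phase limitation can be isolated with a small experiment. The sketch below uses an assumed test tone and noise level: it rebuilds the signal from the exact clean magnitude spectrum combined with the noisy phase, so any remaining error is due to the phase alone.

```matlab
% Sketch: error left by reusing the noisy phase (illustrative signal and noise level)
N = 256;
t = 0:N-1;
s = sin(2*pi*8*t/N);                 % clean test tone, 8 cycles per frame
y = s + 0.5*randn(1, N);             % noisy observation
S = fft(s);
Y = fft(y);
% Ideal magnitude, but the phase is taken from the noisy signal
shat = real(ifft(abs(S).*exp(1i*angle(Y))));
err = norm(shat - s)/norm(s);        % relative reconstruction error
fprintf('relative error from noisy phase alone: %.4f\n', err);
```

Even with a perfect magnitude estimate the error is not zero, and it grows as the noise level rises relative to the signal.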

To better understand spectral-subtraction speech enhancement, a simple simulation of the algorithm is carried out here. The simulation parameters are set at the top of the source code.

II. Source code

winsize=256;                     % window (frame) length
n=0.23;                          % noise level (standard deviation)
a=4; b=6;                        % exponent and over-subtraction factor of the improved method
[speech,fs,nbits]=wavread('speech_dft.wav'); % read wav file (newer MATLAB releases provide audioread instead)
speech=speech(:,1);              % keep the first channel
size=length(speech);             % speech length in samples
numofwin=floor(size/winsize);    % number of non-overlapping windows
ham=hamming(winsize)';           % Hamming window as a row vector
hamwin=zeros(1,size);            % accumulated window gain
enhanced=zeros(1,size);          % output of basic spectral subtraction
improved=zeros(1,size);          % output of improved spectral subtraction
% generate the noisy signal
noise=n*randn(1,size);
y=speech'+noise;
% noise estimate from a separate noise frame
noisy=n*randn(1,winsize);
N=fft(noisy);
npow=abs(N);
for q=1:2*numofwin-1
    idx=1+(q-1)*winsize/2:winsize+(q-1)*winsize/2; % frame indices, 50% overlap
    yframe=y(idx);                   % framing
    hamwin(idx)=hamwin(idx)+ham;
    y1=fft(yframe.*ham);
    ypow=abs(y1);                    % magnitude of the noisy speech
    yangle=angle(y1);                % phase of the noisy speech
    % power spectral densities
    Py=ypow.^2; Pn=npow.^2;
    Pyy=ypow.^a; Pnn=npow.^a;
    % basic spectral subtraction with half-wave rectification
    Ps=max(Py-Pn,0);
    s=sqrt(Ps).*exp(1i*yangle);
    % improved spectral subtraction with exponent a and over-subtraction factor b
    Pss=max(Pyy-b*Pnn,0);
    ss=Pss.^(1/a).*exp(1i*yangle);
    % denoised speech: IFFT and overlap-add
    enhanced(idx)=enhanced(idx)+real(ifft(s));
    improved(idx)=improved(idx)+real(ifft(ss));
end
% remove the gain caused by the Hamming window
for i=1:size
    if hamwin(i)==0
        enhanced(i)=0;
        improved(i)=0;
    else
        enhanced(i)=enhanced(i)/hamwin(i);
        improved(i)=improved(i)/hamwin(i);
    end
end

III. Results
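One way to quantify the outcome is the signal-to-noise ratio measured against the clean speech. A minimal sketch follows. The helper name `snr_db` and the toy signals are illustrative assumptions and are not part of the original script.

```matlab
% Hypothetical helper: SNR in dB of an estimate against a clean reference
snr_db = @(ref, est) 10*log10(sum(ref.^2) / sum((ref - est).^2));

% Toy check with a known answer: a constant offset of 0.1 on a unit signal
ref = ones(1, 100);
est = ref + 0.1;
toy_snr = snr_db(ref, est);          % signal energy 100, error energy 1
fprintf('toy SNR: %.2f dB\n', toy_snr);
```

With the script in the source code section, the clean reference would be `speech'` and the candidates `y`, `enhanced` and `improved`. Restricting the comparison to samples `1:numofwin*winsize` avoids counting the unprocessed tail, which the script leaves at zero.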

IV. Notes

MATLAB version: 2014a
