SVM (Support Vector Machine) is fundamentally a binary classification model, although with modifications it can also be applied to multi-class problems. Support vector machines can be divided into two categories: linear and nonlinear (kernel-based). The main idea is to find a hyperplane in the sample space that separates all the data samples while maximizing the distance from the nearest samples to this hyperplane.

1. Separating data by the maximum margin

1.1 Support vectors and hyperplanes

Before studying the SVM algorithm, we first need to understand the concept of a linear classifier. Suppose we are given a series of data samples, each with a corresponding label. To make the description more intuitive, we use the two-dimensional plane; the principle in higher-dimensional space is the same. Take a simple example: the figure below shows a two-dimensional plane with two different types of data, represented by circles and squares. We can easily find a line that separates the two types of data exactly. But there is more than one such line, so which one should we choose? Looking at the line in Figure 2, is it not better? What we want is the line for which the distance from the closest points to the line is as large as possible. That is a mouthful, but as Figure 2 shows, it amounts to requiring the gap between the two outer parallel lines to be as wide as possible. This makes intuitive sense: if the data samples are random, the wider the margin of the split, the higher the probability that a new point will fall on the side of its own category, and so the higher the accuracy of the final prediction. In higher-dimensional space, lines like this are called hyperplanes, because we cannot picture what the separating "plane" looks like when the number of dimensions is greater than three. The points closest to the hyperplane are the so-called support vectors; in fact, the support vectors alone determine the hyperplane: once you find them, the other samples no longer matter.

Figure 1, Figure 2


1.2 Finding the maximum margin

1.2.1 Distance formula from point to hyperplane

If such a line exists, how do we find it? By analogy with the equation of a line in two-dimensional space, the equation of a hyperplane can be written as follows:

$$w^T x + b = 0 \tag{1.1}$$

Once we have this expression for the hyperplane, we can calculate the distance from a sample point to the plane. Assume that $x_i = (x_i^{(1)}, x_i^{(2)}, \dots, x_i^{(n)})$ is a point in the sample, where $x_i^{(j)}$ is its $j$-th feature variable. Then the distance $d_i$ from that point to the hyperplane can be calculated using the following formula:

$$d_i = \frac{\left|w^T x_i + b\right|}{\|w\|} \tag{1.2}$$


Here $\|w\|$ is the norm of the hyperplane's normal vector $w$, and the constant $b$ is similar to the intercept of a linear equation.

The formula above can be derived using analytic geometry or high-school plane geometry, so the derivation is not repeated here.
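As a quick sanity check, formula (1.2) is easy to evaluate directly in MATLAB. The following is a minimal sketch; the values of w, b, and x are made up for illustration:

w = [2; -1];                     % normal vector of the hyperplane
b = 0.5;                         % intercept term
x = [1; 3];                      % a sample point (made-up values)

d = abs(w' * x + b) / norm(w);   % d = |w'x + b| / ||w||, formula (1.2)
fprintf('Distance from x to the hyperplane: %.4f\n', d);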

1.2.2 Optimization model of the maximum margin

Now we know how to calculate the distance between a data point and the hyperplane. Once a hyperplane is fixed, we can find all of its support vectors and then calculate the margin. Every hyperplane corresponds to a margin, and our goal is to find the hyperplane whose margin is the largest of all. In mathematical language, we must determine the $w$ and $b$ that maximize the margin. This is an optimization problem, and its objective function can be written as:

$$\arg\max_{w,b}\ \min_i \frac{y_i\left(w^T x_i + b\right)}{\|w\|} \tag{1.3}$$

Here $y_i$ is the label of data point $x_i$ and is either $-1$ or $+1$, and the distance is written as $\frac{y_i(w^T x_i + b)}{\|w\|}$. Writing it this way reveals the benefit of choosing $-1$ and $+1$ as labels: if a data point lies on the positive side of the plane (the $+1$ class), then $w^T x_i + b$ is a positive number, whereas when the data point lies on the negative side (the $-1$ class), $w^T x_i + b$ is negative, so the product $y_i(w^T x_i + b)$ is always greater than zero. Notice also that when $w$ and $b$ are scaled up proportionally, the distance $d$ does not change. We can therefore make the functional margin $y_i(w^T x_i + b)$ equal to $1$ for all the support vectors, and greater than $1$ for all the other points, simply by adjusting $w$ and $b$. The problem above then simplifies to:

$$\max_{w,b}\ \frac{1}{\|w\|} \quad \text{s.t.} \quad y_i\left(w^T x_i + b\right) \ge 1, \quad i = 1, \dots, m \tag{1.4}$$

For the convenience of subsequent calculation, the objective function is equivalently replaced by:

$$\min_{w,b}\ \frac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i\left(w^T x_i + b\right) \ge 1, \quad i = 1, \dots, m \tag{1.5}$$

This is a constrained optimization problem, which we can usually solve with the method of Lagrange multipliers. Applying the Lagrange multiplier method:

Let

$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 + \sum_{i=1}^{m} \alpha_i\left(1 - y_i\left(w^T x_i + b\right)\right) \tag{1.6}$$

Setting the partial derivatives of $L$ with respect to $w$ and $b$ to zero, we obtain:

$$w = \sum_{i=1}^{m} \alpha_i y_i x_i, \qquad \sum_{i=1}^{m} \alpha_i y_i = 0 \tag{1.7}$$

Substituting (1.7) into (1.6) simplifies to:

$$L(w, b, \alpha) = \sum_{i=1}^{m} \alpha_i - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j\, x_i^T x_j \tag{1.8}$$

The dual problem of the original problem is:

$$\max_{\alpha}\ \sum_{i=1}^{m} \alpha_i - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j\, x_i^T x_j \quad \text{s.t.} \quad \sum_{i=1}^{m} \alpha_i y_i = 0, \quad \alpha_i \ge 0 \tag{1.9}$$

The KKT conditions of the dual problem are:

$$\alpha_i \ge 0, \qquad y_i\left(w^T x_i + b\right) - 1 \ge 0, \qquad \alpha_i\left(y_i\left(w^T x_i + b\right) - 1\right) = 0 \tag{1.10}$$

At this point, the problem seems to be perfectly solved. But it rests on an assumption: the data must be 100 percent linearly separable. Real data is almost never so "clean"; there is more or less always some noise. For this reason, we introduce slack variables to deal with the problem.
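Before introducing slack, it helps to see that the dual (1.9) is just a quadratic program. Below is a minimal sketch of solving it with quadprog from MATLAB's Optimization Toolbox; the toy data is made up for illustration:

X = [1 2; 2 3; 3 3; 2 0.5; 3 1; 4 1];   % m x n matrix of made-up samples
y = [1; 1; 1; -1; -1; -1];              % labels in {-1, +1}
m = size(X, 1);

% quadprog minimizes 0.5*a'*H*a + f'*a, so we negate the dual objective (1.9).
H = (y * y') .* (X * X');               % H(i,j) = y_i * y_j * x_i' * x_j
f = -ones(m, 1);
Aeq = y'; beq = 0;                      % equality constraint: sum(alpha_i * y_i) = 0
lb = zeros(m, 1);                       % alpha_i >= 0 (no upper bound: hard margin)

alpha = quadprog(H, f, [], [], Aeq, beq, lb, []);

% Recover w from (1.7), and b from the support vectors via the KKT conditions (1.10).
w = X' * (alpha .* y);
sv = find(alpha > 1e-5);                % indices of the support vectors
b = mean(y(sv) - X(sv, :) * w);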

1.2.3 Slack variables

From the analysis in the previous section, we know that in practice many sample sets cannot be completely separated by a hyperplane. If there is noise in the data set, finding the hyperplane becomes a big problem. As can be seen from Figure 3, one of the blue points deviates considerably: if it is taken as a support vector, the resulting margin will be much smaller than if it is excluded. Even worse, if a blue point falls among the red points, no separating hyperplane can be found at all.


Figure 3

So a slack variable $\xi_i$ is introduced to allow some data points to lie on the wrong side of the margin. The new constraints become:

$$y_i\left(w^T x_i + b\right) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \quad i = 1, \dots, m \tag{1.11}$$

In the formula, $\xi_i$ is the deviation allowed for the $i$-th data point. If the $\xi_i$ are allowed to be arbitrarily large, then any hyperplane satisfies the constraints. So in addition to the original objective, we also want the total of the $\xi_i$ to be as small as possible. The new objective function becomes:

$$\min_{w,b,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}\xi_i \tag{1.12}$$

$$\text{s.t.} \quad y_i\left(w^T x_i + b\right) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \quad i = 1, \dots, m \tag{1.13}$$

Here $C$ controls the relative weight of the two objectives: "maximize the margin" and "ensure that the functional margin of most points is at least 1". Writing the above model out in full:

$$\min_{w,b,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}\xi_i \quad \text{s.t.} \quad y_i\left(w^T x_i + b\right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, m \tag{1.14}$$

The new Lagrange function becomes:

$$L(w, b, \xi, \alpha, \mu) = \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{m}\xi_i + \sum_{i=1}^{m}\alpha_i\left(1 - \xi_i - y_i\left(w^T x_i + b\right)\right) - \sum_{i=1}^{m}\mu_i \xi_i \tag{1.15}$$

We then convert the Lagrangian into its dual: first take the partial derivatives of $L$ with respect to $w$, $b$, and $\xi_i$, and set them to zero. The result is as follows:

$$w = \sum_{i=1}^{m}\alpha_i y_i x_i, \qquad \sum_{i=1}^{m}\alpha_i y_i = 0, \qquad C = \alpha_i + \mu_i \tag{1.16}$$

Substituting these back into (1.15) and simplifying, we obtain the same objective function as before:

$$\max_{\alpha}\ \sum_{i=1}^{m}\alpha_i - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i \alpha_j y_i y_j\, x_i^T x_j \tag{1.17}$$

But since $\mu_i \ge 0$ and $C = \alpha_i + \mu_i$, we now have $0 \le \alpha_i \le C$. So the dual problem is written as:

$$\max_{\alpha}\ \sum_{i=1}^{m}\alpha_i - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i \alpha_j y_i y_j\, x_i^T x_j \quad \text{s.t.} \quad \sum_{i=1}^{m}\alpha_i y_i = 0, \quad 0 \le \alpha_i \le C \tag{1.18}$$

By adding the slack variables, we can now handle noisier data. Modifying the parameter $C$ yields different results, and the appropriate size of $C$ must be tuned to the actual problem.
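The effect of $C$ is easy to observe experimentally. The sketch below uses fitcsvm from MATLAB's Statistics and Machine Learning Toolbox (a different toolbox from the SVM-KM library used in the program later); the data, including one deliberately mislabeled point acting as noise, is made up:

rng(1);
X = [randn(50, 2) + 2; randn(50, 2) - 2];   % two made-up Gaussian clusters
y = [ones(50, 1); -ones(50, 1)];
y(1) = -1;                                  % one mislabeled point acting as noise

% A small C tolerates margin violations (wide margin); a large C punishes them.
mdlSoft = fitcsvm(X, y, 'KernelFunction', 'linear', 'BoxConstraint', 0.1);
mdlHard = fitcsvm(X, y, 'KernelFunction', 'linear', 'BoxConstraint', 100);

% Fewer support vectors usually indicates a simpler, wider-margin solution.
fprintf('Support vectors: C=0.1 -> %d, C=100 -> %d\n', ...
    sum(mdlSoft.IsSupportVector), sum(mdlHard.IsSupportVector));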

2. The kernel function

Everything above assumes the data is linearly separable, but the data in practical problems is not always so. For example, some data may look like Figure 4.

Figure 4.

Does that mean such nonlinearly separable data cannot be handled by the SVM algorithm? No. In fact, data that is inseparable in a low-dimensional space may become separable when placed in a higher-dimensional space. Taking data in a two-dimensional plane as an example, we can find a mapping that lifts the points from the two-dimensional plane into three-dimensional space. In theory, for any data set a suitable mapping can be found that makes samples inseparable in the lower-dimensional space linearly separable in the higher-dimensional space. Let us look again at the previous objective function:

$$\max_{\alpha}\ \sum_{i=1}^{m}\alpha_i - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i \alpha_j y_i y_j\, x_i^T x_j \quad \text{s.t.} \quad \sum_{i=1}^{m}\alpha_i y_i = 0, \quad 0 \le \alpha_i \le C \tag{1.19}$$

Define a mapping $\phi(x)$ that sends every sample into the higher-dimensional space. Solving the classification problem there is equivalent to solving the following dual problem:

$$\max_{\alpha}\ \sum_{i=1}^{m}\alpha_i - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i \alpha_j y_i y_j\, \phi(x_i)^T \phi(x_j) \quad \text{s.t.} \quad \sum_{i=1}^{m}\alpha_i y_i = 0, \quad 0 \le \alpha_i \le C \tag{1.20}$$

This solves the problem of linear inseparability; now we only have to find a suitable mapping. But when there are many feature variables, computing the inner product in the high-dimensional space is enormously expensive. Since our goal is not the mapping itself but the inner products in the high-dimensional space, if we can find a formula that computes those inner products directly, we avoid the huge amount of computation and the problem is solved. That formula is exactly the kernel function we are looking for: $\kappa(x_i, x_j) = \phi(x_i)^T \phi(x_j)$, the inner product of two vectors in the implicitly mapped space. A simple example can help us understand the kernel better.
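A standard illustration is the degree-2 polynomial kernel in two dimensions. Take the mapping

$$\phi(x) = \left(x^{(1)\,2},\ \sqrt{2}\,x^{(1)}x^{(2)},\ x^{(2)\,2}\right)$$

Then for any two points $x$ and $z$,

$$\phi(x)^T\phi(z) = x^{(1)\,2}z^{(1)\,2} + 2\,x^{(1)}x^{(2)}z^{(1)}z^{(2)} + x^{(2)\,2}z^{(2)\,2} = \left(x^T z\right)^2$$

So evaluating the simple kernel $\kappa(x, z) = \left(x^T z\right)^2$ in the original two-dimensional space yields exactly the inner product in the three-dimensional mapped space, without ever computing $\phi(x)$ explicitly.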

From this example, we can clearly see how the kernel function works. The dual problem above can then be written as follows:

$$\max_{\alpha}\ \sum_{i=1}^{m}\alpha_i - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i \alpha_j y_i y_j\, \kappa(x_i, x_j) \quad \text{s.t.} \quad \sum_{i=1}^{m}\alpha_i y_i = 0, \quad 0 \le \alpha_i \le C \tag{1.21}$$

So what kind of function can be a kernel? The following theorem can help us determine.

Mercer's theorem: any positive semidefinite function can be used as a kernel function. Positive semidefinite here means: given the training data $x_1, x_2, \dots, x_m$, define a matrix $K$ with elements $K_{ij} = \kappa(x_i, x_j)$; if this matrix is positive semidefinite, then $\kappa$ is called a positive semidefinite function.

It is worth noting that the condition given in the above theorem is sufficient but not necessary, because some functions that are not positive semidefinite can also be used as kernels.

Here are some common kernel functions:

Table 1 Common kernel functions

Kernel name          Expression
Linear kernel        $\kappa(x_i, x_j) = x_i^T x_j$
Polynomial kernel    $\kappa(x_i, x_j) = \left(x_i^T x_j\right)^d$
Gaussian kernel      $\kappa(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$
Exponential kernel   $\kappa(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|}{2\sigma^2}\right)$
Laplacian kernel     $\kappa(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|}{\sigma}\right)$
Sigmoid kernel       $\kappa(x_i, x_j) = \tanh\left(\beta\, x_i^T x_j + \theta\right)$
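Mercer's condition is easy to check numerically for a concrete data set. The following minimal sketch builds a Gaussian kernel matrix for some made-up data and confirms it is positive semidefinite (pdist2 is from the Statistics and Machine Learning Toolbox):

rng(0);
X = randn(20, 2);                  % 20 made-up 2-D samples
sigma = 1.5;                       % kernel width parameter

sqDist = pdist2(X, X).^2;          % pairwise squared Euclidean distances
K = exp(-sqDist / (2 * sigma^2));  % K(i,j) = kappa(x_i, x_j), the Gaussian kernel

% All eigenvalues should be >= 0, up to floating-point error.
fprintf('Smallest eigenvalue of K: %g\n', min(eig(K)));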


Now that we have understood some of the theoretical underpinnings of support vector machines, we can summarize the solution procedure: transform the original problem of finding $w$ and $b$ into the dual problem of finding the multipliers $\alpha_i$. Once all the $\alpha_i$ are found, the support vectors are known and $w$ and $b$ can be determined. The class of a new data point can then be determined by which side of the hyperplane it falls on.
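Concretely, classification uses the sign of the decision function $f(x) = \sum_i \alpha_i y_i\, \kappa(x_i, x) + b$. Below is a minimal sketch, assuming alpha, b, training data Xtrain/ytrain, and a Gaussian kernel width sigma obtained from the steps above (all names here are illustrative, not from the program below):

function label = svm_predict(x, Xtrain, ytrain, alpha, b, sigma)
    % f(x) = sum_i alpha_i * y_i * kappa(x_i, x) + b, classified by its sign.
    % x is an n-by-1 column vector; Xtrain is m-by-n.
    k = exp(-sum((Xtrain - x').^2, 2) / (2 * sigma^2));  % Gaussian kernel column
    label = sign(sum(alpha .* ytrain .* k) + b);
end

The MATLAB GUI program below applies the same pipeline to facial expression recognition: it extracts LBP and LPQ features from a face image and classifies them with a Gaussian-kernel SVM from the SVM-KM toolbox.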

function varargout = main_gui(varargin)
% MAIN_GUI MATLAB code for main_gui.fig
%      MAIN_GUI, by itself, creates a new MAIN_GUI or raises the existing
%      singleton*.
%
%      H = MAIN_GUI returns the handle to a new MAIN_GUI or the handle to
%      the existing singleton*.
%
%      MAIN_GUI('CALLBACK',hObject,eventData,handles,...) calls the local
%      function named CALLBACK in MAIN_GUI.M with the given input arguments.
%
%      MAIN_GUI('Property','Value',...) creates a new MAIN_GUI or raises the
%      existing singleton*.  Starting from the left, property value pairs are
%      applied to the GUI before main_gui_OpeningFcn gets called.  An
%      unrecognized property name or invalid value makes property application
%      stop.  All inputs are passed to main_gui_OpeningFcn via varargin.
%
%      *See GUI Options on GUIDE's Tools menu.  Choose "GUI allows only one
%      instance to run (singleton)".
%
% See also: GUIDE, GUIDATA, GUIHANDLES
 
% Edit the above text to modify the response to help main_gui
 
% Last Modified by GUIDE v2.5 29-Dec-2018 17:29:22
 
% Begin initialization code - DO NOT EDIT
gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @main_gui_OpeningFcn, ...
                   'gui_OutputFcn',  @main_gui_OutputFcn, ...
                   'gui_LayoutFcn',  [] , ...
                   'gui_Callback',   []);
if nargin && ischar(varargin{1})
    gui_State.gui_Callback = str2func(varargin{1});
end
 
if nargout
    [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
    gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT
 
 
% --- Executes just before main_gui is made visible.
function main_gui_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
% varargin   command line arguments to main_gui (see VARARGIN)
 
% Choose default command line output for main_gui
handles.output = hObject;
 
% Update handles structure
guidata(hObject, handles);
 
% UIWAIT makes main_gui wait for user response (see UIRESUME)
% uiwait(handles.figure1);
 
 
% --- Outputs from this function are returned to the command line.
function varargout = main_gui_OutputFcn(hObject, eventdata, handles) 
% varargout  cell array for returning output args (see VARARGOUT);
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
 
% Get default command line output from handles structure
varargout{1} = handles.output;
 
 
% --- Executes on button press in pushbutton1.
function pushbutton1_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton1 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
global str img cc
[filename, pathname] = uigetfile({'*.jpg';'*.bmp'}, 'Select an image');
if isequal(filename, 0), return; end   % user cancelled the file dialog
str = [pathname, filename];
img = imread(str);
cc=imread(str);
subplot(1,3,1),imshow(cc);
set(handles.text5,'string',str);
 
 
% --- Executes on button press in pushbutton3.
function pushbutton3_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton3 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
close(gcf);
 
 
 
 
% --- Executes on button press in pushbutton4.
function pushbutton4_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton4 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
global cc img t
load('save.mat');
mapping = getmapping(8, 'u2');   % uniform LBP mapping with 8 neighbors
        W=[2,1,1,1,1,1,2; ...
          2,4,4,1,4,4,2; ...
          1,1,1,0,1,1,1; ...
          0,1,1,0,1,1,0; ...
          0,1,1,1,1,1,0; ...
          0,1,1,2,1,1,0; ...
          0,1,1,1,1,1,0]; 
d=[];
image_size = size(cc);
if numel(image_size) == 3
    cc = rgb2gray(cc);                 % convert RGB input to grayscale
end

X = double(cc);
X = 255 * imadjust(X/255, [0.3; 1], [0; 1]);   % stretch the gray levels
X = imresize(X, [64 64], 'bilinear');  % resize to 64x64 with bilinear interpolation
H2 = DSLBP(X, mapping, W);             % extract the DS-LBP histogram of the image
Gray = X;
Gray = (Gray - mean(Gray(:))) / std(Gray(:)) * 20 + 128;   % normalize the gray levels
lpqhist = lpq(Gray, 3, 1, 1, 'nh');    % compute the LPQ histogram of the image
a = [H2, lpqhist];                     % concatenate the two feature vectors
d = [d; a];
 
P_test=d;
P_test=mapminmax(P_test,0,1);
%%%%%%%% end of feature extraction
 
 
%%%%% expression recognition starts here, using a support vector machine
addpath SVM-KM        % add the SVM-KM support vector machine toolbox

c = 100;
kerneloption = 1.3;   % kernel parameter
kernel = 'gaussian';  % use the Gaussian kernel for the SVM
[ypred2,maxi] = svmmultival(P_test,xsup,w,b,nbsv,kernel,kerneloption);
 
   labels = {'Anger', 'Disgust', 'Fear', 'Happiness', 'Sad', 'Surprise'};
   detector = vision.CascadeObjectDetector;   % face detector, created once
   for i = 1:length(ypred2)
       t = labels{ypred2(i)};                 % map the class index to its name
       disp(t);
       bboxes = step(detector, img);
       FrontalFaceCART = insertObjectAnnotation(img, 'rectangle', bboxes, t, ...
           'color', 'cyan', 'TextBoxOpacity', 0.8, 'FontSize', 13);
       subplot(1,3,2), imshow(FrontalFaceCART);
   end
   set(handles.text10, 'string', t);
 
% --- Executes during object creation, after setting all properties.
function text5_CreateFcn(hObject, eventdata, handles)
% hObject    handle to text5 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    empty - handles not created until after all CreateFcns called
 
 
% --- Executes on button press in pushbutton6.
function pushbutton6_Callback(hObject, eventdata, handles)
% hObject    handle to pushbutton6 (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
 
load('save.mat');
disp('Training finished');
axes(handles.axes2);
vid = videoinput('winvideo',1,'YUY2_640x480');
set(vid,'ReturnedColorSpace','rgb');
vidRes = get(vid, 'VideoResolution');
nBands = get(vid, 'NumberOfBands');
hImage = image( zeros(vidRes(2), vidRes(1), nBands) );
preview(vid, hImage);
disp('Camera started');
faceDetector1 = vision.CascadeObjectDetector;
while(1)
frame = getsnapshot(vid);
box = step(faceDetector1, frame); % Detect faces
if ~isempty(box)
    ff = imcrop(frame, box(1,:));   % crop the first detected face
    ff = rgb2gray(ff);
    ff = histeq(ff);                % histogram equalization
    yy=svm_test(xsup,w,b,nbsv,ff);
    h=rectangle('position',[box(1),box(2),box(3),box(4)],'LineWidth',2,'edgecolor','b');
    labels = {'Anger', 'Disgust', 'Fear', 'Happiness', 'Sad', 'Surprise'};
    for i = 1:length(yy)
        t1 = text(box(1), box(2)-20, labels{yy(i)}, ...
            'FontSize', 14, 'Color', 'blue', 'FontWeight', 'Bold');
    end
      pause(0.05);
      set(t1,'string',[]);
      delete(h);
        if strcmpi(get(gcf, 'CurrentCharacter'), 'c')   % press 'c' to quit
            delete(vid);
            disp('Exiting');
            break;
        end
 
      
else
    t1 = text(10, 10, 'No face detected', 'FontAngle', 'italic', ...
        'FontSize', 15, 'Color', 'b', 'FontWeight', 'Bold');
    pause(0.05);
    set(t1, 'string', []);
    if strcmpi(get(gcf, 'CurrentCharacter'), '1')   % press '1' to quit
        delete(vid);
        disp('Exiting');
        break;
    end
end
end