Author: Wei Zuchang

1. Background introduction

On January 29, 2020, an official of the Ministry of Education said in an interview that the prevention and control of COVID-19 was the top priority at the time. Education departments at all levels were making all-out efforts to prevent the spread of COVID-19 in schools in accordance with the unified deployment requirements of the Ministry of Education and local Party committees and governments, and postponing the start of the school term was an important measure. At the same time, local education departments did a lot of work to ensure that teaching and learning in primary and secondary schools would not stop during the epidemic. Online teaching was the answer.

However, with the shift to online teaching, teachers can no longer observe students' learning status in real time as they would in a classroom, and students may not take their studies as seriously. Since the teaching happens online, we can apply deep learning techniques to help teachers observe students' facial expressions. Starting from a CNN, this paper implements a deep model for facial expression recognition.

2. Data sets

The FER2013 facial expression dataset consists of 35,886 face images: 28,708 training images, 3,589 public test images, and 3,589 private test images. Each image is a grayscale image with a fixed size of 48×48. There are 7 expression categories, corresponding to the numeric labels 0–6: 0 Angry; 1 Disgust; 2 Fear; 3 Happy; 4 Sad; 5 Surprise; 6 Neutral.
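For reference, the label mapping above can be written as a small lookup table (the name `FER2013_LABELS` is our own choice for this illustration, not part of the dataset):

```python
# Mapping from FER2013 numeric labels to expression names,
# as described in the text above.
FER2013_LABELS = {
    0: "Angry",
    1: "Disgust",
    2: "Fear",
    3: "Happy",
    4: "Sad",
    5: "Surprise",
    6: "Neutral",
}
```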

At the same time, we apply flipping, rotation, scaling, cropping, translation, and noise injection to the image data. Some of you might wonder: don't we already have the data? Why do this? Here is why.

This operation is called data augmentation. Deep learning generally requires a sufficient number of samples: the more samples, the better the trained model performs and the stronger its generalization ability. In practice, however, the number of samples is often insufficient or their quality is not good enough, so we augment the data to improve it. The role of data augmentation can be summarized as follows:

  • Increase the amount of training data and improve the generalization ability of the model
  • Add noisy data and improve the robustness of the model

Figure 1: Operations in data augmentation
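As a minimal sketch of two of the operations listed above (using plain NumPy purely for illustration; the function name and `noise_std` default are our own choices), horizontal flipping and Gaussian noise can be applied to a 48×48 grayscale image like this:

```python
import numpy as np

def augment(image, noise_std=0.05, rng=None):
    """Return a horizontally flipped copy and a noisy copy of an image.

    The other operations in Figure 1 (rotation, scaling, cropping,
    translation) could be added in the same style.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    flipped = image[:, ::-1]  # horizontal flip
    # add Gaussian noise, keeping pixel values in [0, 1]
    noisy = np.clip(image + rng.normal(0.0, noise_std, image.shape), 0.0, 1.0)
    return flipped, noisy
```

In practice, libraries such as Keras perform these transformations on the fly while feeding batches to the model.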

3. Model structure

We first read the FER2013 data. Since it ships as a CSV file of grayscale pixel values, it differs somewhat from conventional image training data. First we read the CSV file:

import numpy as np

def read_data_np(path):
    with open(path) as f:
        content = f.readlines()
    lines = np.array(content)
    num_of_instances = lines.size
    print("number of instances: ", num_of_instances)
    print("instance length: ", len(lines[1].split(",")[1].split(" ")))
    return lines, num_of_instances

After the file is read, we reshape the data: each line is parsed into a 48×48 grayscale image and its label, producing the arrays x_train, y_train, x_test, and y_test.

import keras

def reshape_dataset(paths, num_classes):
    x_train, y_train, x_test, y_test = [], [], [], []

    lines, num_of_instances = read_data_np(paths)

    # ------------------------------------------------------------------
    # transfer train and test set data
    for i in range(1, num_of_instances):
        try:
            emotion, img, usage = lines[i].split(",")

            val = img.split(" ")

            pixels = np.array(val, 'float32')

            emotion = keras.utils.to_categorical(emotion, num_classes)

            if 'Training' in usage:
                y_train.append(emotion)
                x_train.append(pixels)
            elif 'PublicTest' in usage:
                y_test.append(emotion)
                x_test.append(pixels)
        except Exception:
            print("", end="")

    # ------------------------------------------------------------------
    # data transformation for train and test sets
    x_train = np.array(x_train, 'float32')
    y_train = np.array(y_train, 'float32')
    x_test = np.array(x_test, 'float32')
    y_test = np.array(y_test, 'float32')

    x_train /= 255  # normalize inputs between [0, 1]
    x_test /= 255

    x_train = x_train.reshape(x_train.shape[0], 48, 48, 1)
    x_train = x_train.astype('float32')
    x_test = x_test.reshape(x_test.shape[0], 48, 48, 1)
    x_test = x_test.astype('float32')

    print(x_train.shape[0], 'train samples')
    print(x_test.shape[0], 'test samples')

    y_train = y_train.reshape(y_train.shape[0], 7)
    y_train = y_train.astype('int16')
    y_test = y_test.reshape(y_test.shape[0], 7)
    y_test = y_test.astype('int16')

    print('--------x_train.shape:', x_train.shape)
    print('--------y_train.shape:', y_train.shape)

    print(len(x_train), 'train x size')
    print(len(y_train), 'train y size')
    print(len(x_test), 'test x size')
    print(len(y_test), 'test y size')

    return x_train, y_train, x_test, y_test
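To see the parsing logic in isolation, here is a tiny self-contained sketch with a synthetic CSV line (the pixel value 128 is made up for illustration, and `np.eye` stands in for `keras.utils.to_categorical`):

```python
import numpy as np

# A synthetic FER2013-style line: label, 48*48 space-separated pixels, usage.
line = "3," + " ".join(["128"] * (48 * 48)) + ",Training"

emotion, img, usage = line.split(",")
pixels = np.array(img.split(" "), "float32") / 255.0  # normalize to [0, 1]
one_hot = np.eye(7)[int(emotion)]                     # one-hot encode the label
image = pixels.reshape(48, 48, 1)                     # restore the image shape
```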

The model mainly uses three convolution layers and two fully connected layers, and finally outputs the probability of each category through a softmax. The main model code is as follows:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, AveragePooling2D
from keras.layers import Flatten, Dense, Dropout

def build_model(num_classes):
    # construct CNN structure
    model = Sequential()
    # 1st convolution layer
    model.add(Conv2D(64, (5, 5), activation='relu', input_shape=(48, 48, 1)))
    model.add(MaxPooling2D(pool_size=(5, 5), strides=(2, 2)))

    # 2nd convolution layer
    model.add(Conv2D(64, (3, 3), activation='relu'))
    # model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(AveragePooling2D(pool_size=(3, 3), strides=(2, 2)))

    # 3rd convolution layer
    model.add(Conv2D(128, (3, 3), activation='relu'))
    # model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(AveragePooling2D(pool_size=(3, 3), strides=(2, 2)))

    model.add(Flatten())

    # fully connected layers
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.2))

    model.add(Dense(num_classes, activation='softmax'))

    return model
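For completeness, here is a sketch of how the model might be compiled and trained on augmented batches. The article does not show this code, so the function name `compile_and_train`, the Adam optimizer, and the specific augmentation parameters are our own assumptions:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def compile_and_train(model, x_train, y_train,
                      batch_size=32, steps_per_epoch=256, epochs=5):
    """Compile `model` and fit it on batches from an augmenting generator.

    The augmentation settings below (rotation, shifts, horizontal flip)
    are illustrative choices, not taken from the original article.
    """
    datagen = ImageDataGenerator(rotation_range=10,
                                 width_shift_range=0.1,
                                 height_shift_range=0.1,
                                 horizontal_flip=True)
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    history = model.fit(datagen.flow(x_train, y_train, batch_size=batch_size),
                        steps_per_epoch=steps_per_epoch,
                        epochs=epochs,
                        verbose=0)
    return history
```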

4. Training results

We set steps_per_epoch=256 and epochs=5 and trained with data augmentation. The experimental results are shown below:

Figure 2: Training results with data augmentation

Without data augmentation, the experimental results are as follows:

Figure 3: Training results without data augmentation

We can see that training accuracy is indeed higher without augmentation: improving the model's generalization ability inevitably lowers its accuracy on the training set. However, we can compare the two models on some additional images. The results are shown below (data augmentation on the left, no augmentation on the right):

From the experimental results in the figure above, the augmented model classifies the happy picture more accurately than the unaugmented one. Likewise, when the augmented model judges the Mona Lisa, its output better matches the common perception of the Mona Lisa's mysterious expression.

5. Summary

In this paper we use a simple CNN, and although its accuracy is not very high, we can still use existing technology to support online teaching and help teachers analyze students' attentiveness. At the same time, if students know that a pair of invisible eyes is watching them, they may treat online classes with the same attitude as in-person ones, improving the efficiency of online teaching.

Project address: momodel.cn/workspace/5… (it is recommended to open it in Google Chrome on a PC).

References

  1. Deep Learning: Why Data Augmentation?
  2. Reference code: github.com/naughtybaby…

Mo (momodel.cn) is a Python-enabled online modeling platform for artificial intelligence that helps you quickly develop, train, and deploy models.

Recently, Mo has also been running introductory machine learning courses and paper-sharing activities. Welcome to follow our official account for the latest news!