Constructing a deep neural network to improve model accuracy

The neural network we built in the previous article had only one hidden layer between the input layer and the output layer. In this section, we will use multiple hidden layers in the neural network (hence the name deep neural network) to explore the effect of network depth on model performance.

A deep neural network has multiple hidden layers between the input layer and the output layer. These hidden layers allow the network to learn complex nonlinear relationships between inputs and outputs that a simple single-hidden-layer network cannot.

We build such a deep neural network architecture by adding multiple hidden layers between the input and output layers, as shown in the following steps.

  1. Load and scale the dataset:
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Flatten each 28 x 28 image into a 784-dimensional vector
num_pixels = x_train.shape[1] * x_train.shape[2]
x_train = x_train.reshape(-1, num_pixels).astype('float32')
x_test = x_test.reshape(-1, num_pixels).astype('float32')

# Scale pixel values to the range [0, 1]
x_train = x_train / 255.
x_test = x_test / 255.

# One-hot encode the labels
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
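
As a quick sanity check, the shapes after preprocessing should look as follows (assuming the standard 60,000/10,000 MNIST train/test split):

print(x_train.shape, y_train.shape)  # (60000, 784) (60000, 10)
print(x_test.shape, y_test.shape)    # (10000, 784) (10000, 10)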
  2. Build a model with multiple hidden layers between the input and output layers:
model = Sequential()
# First hidden layer, taking the flattened image as input
model.add(Dense(512, input_dim=num_pixels, activation='relu'))
# Additional hidden layers make the network "deep"
model.add(Dense(1024, activation='relu'))
model.add(Dense(64, activation='relu'))
# Output layer with one node per digit class
model.add(Dense(num_classes, activation='softmax'))
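
The resulting architecture can be inspected with Keras' built-in summary() method:

model.summary()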

This prints the following model information:

Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= dense (Dense) (None, 512) 401920 _________________________________________________________________ dense_1 (Dense) (None, 1024) 525312 _________________________________________________________________ dense_2 (Dense) (None, 64) 65600 _________________________________________________________________ dense_3 (Dense) (None, 10) 650 = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = Total params: 993482 Trainable params: 993482 Non - trainable params: 0 _________________________________________________________________Copy the code

Since the deep neural network architecture contains more hidden layers, the model also contains more parameters.
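
Each entry in the Param # column can be verified by hand: a Dense layer with n inputs and m units has (n + 1) × m parameters, where the extra input accounts for the bias term:

# Parameters of a Dense layer = (inputs + 1) * units; the +1 is the bias
print((784 + 1) * 512)    # 401920 (dense)
print((512 + 1) * 1024)   # 525312 (dense_1)
print((1024 + 1) * 64)    # 65600  (dense_2)
print((64 + 1) * 10)      # 650    (dense_3); total: 993482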

  3. Once the model is built, you can compile and fit it:
# Compile with categorical cross-entropy loss and the Adam optimizer
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])

# Train for 50 epochs, validating on the test set after every epoch
history = model.fit(x_train, y_train,
                    validation_data=(x_test, y_test),
                    epochs=50,
                    batch_size=64,
                    verbose=1)
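
After training, the final test accuracy reported below can be reproduced with evaluate(), which returns the loss together with the metrics passed to compile():

test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(test_acc)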

The trained model achieves an accuracy of about 98.9%, only slightly better than the previous architecture, because the MNIST dataset is relatively simple. The loss and accuracy on the training and test sets are as follows:
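
These curves can be plotted from the history object returned by fit(); below is a minimal sketch, assuming matplotlib is installed:

import matplotlib.pyplot as plt

epochs = range(1, len(history.history['loss']) + 1)

# Loss on the training and test (validation) sets
plt.subplot(1, 2, 1)
plt.plot(epochs, history.history['loss'], label='Training loss')
plt.plot(epochs, history.history['val_loss'], label='Test loss')
plt.legend()

# Accuracy; the 'acc' key matches the metric name passed to compile()
plt.subplot(1, 2, 2)
plt.plot(epochs, history.history['acc'], label='Training accuracy')
plt.plot(epochs, history.history['val_acc'], label='Test accuracy')
plt.legend()

plt.show()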

As these curves show, the accuracy on the training set is considerably higher than on the test set, which indicates that the deep neural network overfits the training data. In subsequent articles, we will learn about methods to avoid overfitting the training data.

Related links

Keras deep learning — Training primitive neural networks