Building a simple fully connected neural network for MNIST with Keras

Note: This is the second machine learning project I have worked on since February 2020. My understanding in the later part of the article is fairly superficial, but the code walkthrough in the first part should be a useful reference for beginners. The project uses Keras, one of the most convenient and quickest libraries for building neural networks. Because it is so highly integrated, a newcomer to machine learning can get started quickly and have this model running. As your understanding of deep learning grows, you can write a DNN using only NumPy, then get familiar with the TensorFlow framework for model development, and gradually improve the model with mechanisms such as dropout and batch normalization. If you are interested, you can try other frameworks such as PyTorch, which I now use often and find more comfortable for writing DNNs and CNNs.

That is roughly my learning path from February 2020 to June 2021, for your reference.

Code and analysis

# Suppress unnecessary warnings
import warnings
warnings.filterwarnings("ignore")
# Import the basic packages
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
# Embed the plots in the Jupyter notebook
%matplotlib inline

# Import the MNIST dataset from the Keras library
from keras.datasets import mnist

# Load the data, which comes already split into a training set and a test set
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# Print sample information
print('There are {} samples in the training set'.format(len(X_train)))
print('There are {} samples in test set'.format(len(X_test)))
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

There are 60000 samples in the training set
There are 10000 samples in test set
(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)

plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can display Chinese characters
plt.rcParams['axes.unicode_minus'] = False  # display the minus sign correctly

# Display the images with their true labels (grayscale)
plt.style.use({'figure.figsize': (12, 12)})
for i in range(16):
    plt.subplot(1, 4, i % 4 + 1)
    plt.imshow(X_train[i], cmap='gray')
    title = 'The true label:{}'.format(str(y_train[i]))
    plt.title(title)
    plt.xticks([])
    plt.yticks([])
    plt.axis('off')
    if i % 4 == 3:
        plt.show()

# A computer sees an image as pixels, each a gray level from 0 to 255; draw the image as a grayscale heat map annotated with the pixel values
def visualize_input(img, ax):
    ax.imshow(img, cmap='gray')
    height, width = img.shape
    thresh = img.max() / 2.5
    for x in range(height):
        for y in range(width):
            ax.annotate(str(round(img[x][y], 2)), xy=(y, x),
                        horizontalalignment='center',
                        verticalalignment='center',
                        color='white' if img[x][y] < thresh else 'black')

# Draw the digit image at index 53
i = 53
fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot(111)
visualize_input(X_train[i], ax)

# Preprocessing: normalize the pixel values from 0-255 to the range 0-1
X_train = X_train.astype('float32') / 255
X_test = X_test.astype('float32') / 255
# Print one element of the training set
print(X_train[0])

[[0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.01176471 0.07058824 0.07058824 0.07058824 0.49411765 0.53333336
  0.6862745  0.10196079 0.6509804  1.         0.96862745 0.49803922
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.11764706 0.14117648 0.36862746 0.6039216
  0.6666667  0.99215686 0.99215686 0.99215686 0.99215686 0.99215686
  0.88235295 0.6745098  0.99215686 0.9490196  0.7647059  0.2509804
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.19215687 0.93333334 0.99215686 0.99215686 0.99215686
  0.99215686 0.99215686 0.99215686 0.99215686 0.99215686 0.9843137
  0.3647059  0.32156864 0.32156864 0.21960784 0.15294118 0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.07058824 0.85882354 0.99215686 0.99215686 0.99215686
  0.99215686 0.99215686 0.7764706  0.7137255  0.96862745 0.94509804
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.3137255  0.6117647  0.41960785 0.99215686
  0.99215686 0.8039216  0.04313726 0.         0.16862746 0.6039216
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.05490196 0.00392157 0.6039216
  0.99215686 0.3529412  0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.54509807
  0.99215686 0.74509805 0.00784314 0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.04313726
  0.74509805 0.99215686 0.27450982 0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.13725491 0.94509804 0.88235295 0.627451   0.42352942 0.00392157
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.31764707 0.9411765  0.99215686 0.99215686 0.46666667
  0.09803922 0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.1764706  0.7294118  0.99215686 0.99215686
  0.5882353  0.10588235 0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.0627451  0.3647059  0.9882353
  0.99215686 0.73333335 0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.9764706
  0.99215686 0.9764706  0.2509804  0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.18039216 0.50980395 0.7176471  0.99215686
  0.99215686 0.8117647  0.00784314 0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.15294118 0.5803922  0.8980392  0.99215686 0.99215686 0.99215686
  0.98039216 0.7137255  0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.09411765 0.44705883
  0.8666667  0.99215686 0.99215686 0.99215686 0.99215686 0.7882353
  0.30588236 0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.09019608 0.25882354 0.8352941  0.99215686
  0.99215686 0.99215686 0.99215686 0.7764706  0.31764707 0.00784314
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.07058824 0.67058825 0.85882354 0.99215686 0.99215686 0.99215686
  0.99215686 0.7647059  0.3137255  0.03529412 0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.21568628 0.6745098
  0.8862745  0.99215686 0.99215686 0.99215686 0.99215686 0.95686275
  0.52156866 0.04313726 0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.53333336 0.99215686
  0.99215686 0.99215686 0.83137256 0.5294118  0.5176471  0.0627451
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]
 [0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.        ]]
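As a quick sanity check on the normalization, the printed values can be mapped back to raw gray levels. The sketch below uses plain Python rather than NumPy so it runs anywhere; the helper name `normalize` and the sample gray levels are assumptions chosen for illustration (they are the integers that reproduce values visible in the printed array).

```python
# Minimal sketch of scaling 8-bit pixels (hypothetical helper, not a Keras function).
def normalize(pixels):
    """Scale gray levels from the range 0-255 to the range 0-1."""
    return [p / 255 for p in pixels]

# Gray levels assumed for illustration; 3, 18, 126 and 253 reproduce
# values that appear in the printed array above.
raw = [0, 3, 18, 126, 253, 255]
scaled = normalize(raw)

for p, s in zip(raw, scaled):
    print(p, '->', round(s, 8))
```

Dividing by 255 keeps every input in the same small range, which generally makes gradient-based training better behaved than feeding raw 0-255 values.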

# Preprocessing: one-hot encode the labels
# A label is only a class index with no numeric magnitude, so it is converted to a one-hot vector
from keras.utils import np_utils
y_train = np_utils.to_categorical(y_train, 10)
y_test = np_utils.to_categorical(y_test, 10)
y_train[:10]

array([[0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.]], dtype=float32)
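To make the encoding concrete, here is a minimal plain-Python sketch of what one-hot encoding does and how a label is recovered from it. The helpers `to_one_hot` and `from_one_hot` are hypothetical names for illustration, not the Keras implementation; the labels are the first five read off the array above.

```python
# Minimal sketch of one-hot encoding (hypothetical helpers, not the Keras code).
def to_one_hot(labels, num_classes):
    """Turn each integer label into a vector with a single 1.0 at the label's index."""
    vectors = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        vectors.append(row)
    return vectors

def from_one_hot(vector):
    """Recover the integer label: the index of the largest entry."""
    return max(range(len(vector)), key=lambda k: vector[k])

labels = [5, 0, 4, 1, 9]  # first five training labels, read off the array above
encoded = to_one_hot(labels, 10)
decoded = [from_one_hot(v) for v in encoded]

print(encoded[0])
print(decoded)
```

Recovering the label as the index of the largest entry is the same idea as the `np.argmax` calls used later when comparing true and predicted labels.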

from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
model = Sequential()
# Flatten each 28*28 image into a one-dimensional vector of length 784
model.add(Flatten(input_shape=(28, 28)))
# Build a neural network with three hidden layers
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
# Each Dropout layer randomly zeroes 20% of its inputs during training to reduce overfitting
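What one hidden-layer step of this model computes can be sketched in plain Python: a fully connected layer with ReLU, followed by dropout. The helper names, the toy sizes (3 inputs and 2 neurons instead of 784 and 512), and the weights are all assumptions for illustration; Keras performs the same arithmetic on whole tensors and applies dropout only during training.

```python
import random

def dense_relu(x, weights, biases):
    """One fully connected layer with ReLU: each output is max(0, w . x + b)."""
    return [max(0.0, sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, biases)]

def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`,
    and scale the survivors by 1 / (1 - rate)."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

# Toy input and parameters, assumed for illustration only.
x = [0.5, 1.0, 0.0]
weights = [[0.2, -0.4, 0.1],   # weights of neuron 0
           [0.6, 0.3, -0.5]]   # weights of neuron 1
biases = [0.1, 0.0]

hidden = dense_relu(x, weights, biases)
print(hidden)

rng = random.Random(0)  # fixed seed so the run is repeatable
dropped = dropout([1.0] * 10000, rate=0.2, rng=rng)
print(sum(1 for v in dropped if v == 0.0) / len(dropped))  # close to 0.2
```

Scaling the surviving values keeps the expected sum of the layer's outputs unchanged, so nothing needs to be rescaled when dropout is switched off at prediction time.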

model.summary()

Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten_11 (Flatten)         (None, 784)               0         
_________________________________________________________________
dense_34 (Dense)             (None, 512)               401920    
_________________________________________________________________
dropout_24 (Dropout)         (None, 512)               0         
_________________________________________________________________
dense_35 (Dense)             (None, 512)               262656    
_________________________________________________________________
dropout_25 (Dropout)         (None, 512)               0         
_________________________________________________________________
dense_36 (Dense)             (None, 512)               262656    
_________________________________________________________________
dropout_26 (Dropout)         (None, 512)               0         
_________________________________________________________________
dense_37 (Dense)             (None, 10)                5130      
=================================================================
Total params: 932,362
Trainable params: 932,362
Non-trainable params: 0
_________________________________________________________________
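The parameter counts in the summary can be reproduced by hand: a dense layer has one weight per input-output pair plus one bias per unit, while Flatten and Dropout have no parameters. A small sketch, where `dense_params` is a hypothetical helper name:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Flattened input, three hidden layers, output layer.
layer_sizes = [784, 512, 512, 512, 10]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [401920, 262656, 262656, 5130]
print(sum(counts))  # 932362
```

Most of the parameters sit in the first layer, because it connects all 784 input pixels to 512 units.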

model.compile(loss='categorical_crossentropy',optimizer='rmsprop',metrics=['accuracy'])

model.evaluate(X_test, y_test, verbose=1)
# Before training, accuracy is around 10%, the chance of guessing 1 of 10 digits at random

10000/10000 [==============================] - 1s 62us/step
[2.3300374851226806, 0.0908999964594841]
Only 9.09%
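The untrained loss of about 2.33 is also close to what pure guessing gives: if the softmax output spreads probability evenly over 10 classes, the categorical cross-entropy is ln 10, roughly 2.30. The sketch below checks this in plain Python; the all-zero scores and the true label 7 are assumptions for illustration, and the two functions are simplified single-sample versions rather than the Keras implementations.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot, probs):
    """Negative log of the probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

# Assumption: an untrained network that scores all 10 digits equally.
probs = softmax([0.0] * 10)
target = [0.0] * 10
target[7] = 1.0  # hypothetical true label

loss = categorical_crossentropy(target, probs)
print(loss, math.log(10))
```

A freshly initialized network is not exactly uniform, which would explain why the measured value is slightly above ln 10.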

# Save the model with the lowest validation loss to an HDF5 file
from keras.callbacks import ModelCheckpoint
checkpointer = ModelCheckpoint(filepath='mnist.model.best.hdf5', verbose=1, save_best_only=True)
# Split the training set 4:1, holding out 20% for validation, and train for 16 epochs
hist = model.fit(X_train, y_train, batch_size=128, epochs=16, validation_split=0.2, callbacks=[checkpointer], verbose=1, shuffle=True)
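The effect of `validation_split=0.2` and `batch_size=128` can be worked out with a little arithmetic. The sketch below is an illustration only, not the Keras source; the variable names are assumptions. Keras documents that the validation samples are taken from the end of the arrays, before any shuffling.

```python
import math

n_samples = 60000        # size of the MNIST training set loaded earlier
validation_split = 0.2
batch_size = 128

n_val = round(n_samples * validation_split)  # held out for validation
n_train = n_samples - n_val                  # used for weight updates
steps_per_epoch = math.ceil(n_train / batch_size)

# Index ranges of the two parts: validation comes from the end of the data.
train_range = (0, n_train)
val_range = (n_train, n_samples)

print(n_train, n_val, steps_per_epoch)
```

Because the hold-out comes from the tail of the data, `shuffle=True` only reorders the 48000 training samples between epochs; the validation set stays the same throughout.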

Train on 48000 samples, validate on 12000 samples
Epoch 1/16
48000/48000 [==============================] - 7s 149us/step - loss: 0.2951 - accuracy: 0.9085 - val_loss: 0.1216 - val_accuracy: 0.9638

Epoch 00001: val_loss improved from inf to 0.12163, saving model to mnist.model.best.hdf5
Epoch 2/16
48000/48000 [==============================] - 6s 126us/step - loss: 0.1240 - accuracy: 0.9632 - val_loss: 0.1069 - val_accuracy: 0.9706

Epoch 00002: val_loss improved from 0.12163 to 0.10688, saving model to mnist.model.best.hdf5
Epoch 3/16
48000/48000 [==============================] - 6s 118us/step - loss: 0.0920 - accuracy: 0.9721 - val_loss: 0.0980 - val_accuracy: 0.9735

Epoch 00003: val_loss improved from 0.10688 to 0.09795, saving model to mnist.model.best.hdf5
Epoch 4/16
48000/48000 [==============================] - 6s 117us/step - loss: 0.0729 - accuracy: 0.9795 - val_loss: 0.1035 - val_accuracy: 0.9753

Epoch 00004: val_loss did not improve from 0.09795
Epoch 5/16
48000/48000 [==============================] - 6s 119us/step - loss: 0.0657 - accuracy: 0.9814 - val_loss: 0.1062 - val_accuracy: 0.9758

Epoch 00005: val_loss did not improve from 0.09795
Epoch 6/16
48000/48000 [==============================] - 6s 120us/step - loss: 0.0552 - accuracy: 0.9844 - val_loss: 0.1136 - val_accuracy: 0.9779

Epoch 00006: val_loss did not improve from 0.09795
Epoch 7/16
48000/48000 [==============================] - 6s 127us/step - loss: 0.0527 - accuracy: 0.9856 - val_loss: 0.1142 - val_accuracy: 0.9787

Epoch 00007: val_loss did not improve from 0.09795
Epoch 8/16
48000/48000 [==============================] - 6s 120us/step - loss: 0.0460 - accuracy: 0.9877 - val_loss: 0.1206 - val_accuracy: 0.9780

Epoch 00008: val_loss did not improve from 0.09795
Epoch 9/16
48000/48000 [==============================] - 6s 121us/step - loss: 0.0463 - accuracy: 0.9878 - val_loss: 0.1515 - val_accuracy: 0.9736

Epoch 00009: val_loss did not improve from 0.09795
Epoch 10/16
48000/48000 [==============================] - 6s 120us/step - loss: 0.0444 - accuracy: 0.9889 - val_loss: 0.1232 - val_accuracy: 0.9778

Epoch 00010: val_loss did not improve from 0.09795
Epoch 11/16
48000/48000 [==============================] - 6s 117us/step - loss: 0.0412 - accuracy: 0.9892 - val_loss: 0.1682 - val_accuracy: 0.9801

Epoch 00011: val_loss did not improve from 0.09795
Epoch 12/16
48000/48000 [==============================] - 6s 118us/step - loss: 0.0403 - accuracy: 0.9908 - val_loss: 0.1517 - val_accuracy: 0.9811

Epoch 00012: val_loss did not improve from 0.09795
Epoch 13/16
48000/48000 [==============================] - 6s 117us/step - loss: 0.0368 - accuracy: 0.9904 - val_loss: 0.1541 - val_accuracy: 0.9787

Epoch 00013: val_loss did not improve from 0.09795
Epoch 14/16
48000/48000 [==============================] - 6s 121us/step - loss: 0.0396 - accuracy: 0.9904 - val_loss: 0.1681 - val_accuracy: 0.9766

Epoch 00014: val_loss did not improve from 0.09795
Epoch 15/16
48000/48000 [==============================] - 6s 119us/step - loss: 0.0374 - accuracy: 0.9913 - val_loss: 0.1767 - val_accuracy: 0.9799	

Epoch 00015: val_loss did not improve from 0.09795
Epoch 16/16
48000/48000 [==============================] - 6s 117us/step - loss: 0.0371 - accuracy: 0.9924 - val_loss: 0.2129 - val_accuracy: 0.9808

Epoch 00016: val_loss did not improve from 0.09795
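The "improved" and "did not improve" messages follow from simple bookkeeping: a checkpoint is written only when the monitored value (here `val_loss`) beats the best value seen so far. The sketch below replays that logic on the validation losses copied from the log above; it illustrates the idea and is not the Keras callback code.

```python
# val_loss per epoch, copied from the training log above.
val_losses = [0.1216, 0.1069, 0.0980, 0.1035, 0.1062, 0.1136, 0.1142, 0.1206,
              0.1515, 0.1232, 0.1682, 0.1517, 0.1541, 0.1681, 0.1767, 0.2129]

best = float('inf')
saved_epochs = []  # epochs at which a checkpoint would be written
for epoch, val_loss in enumerate(val_losses, start=1):
    if val_loss < best:
        best = val_loss
        saved_epochs.append(epoch)

print(saved_epochs, best)  # [1, 2, 3] 0.098
```

The file on disk therefore holds the weights from epoch 3, even though training accuracy kept rising for 13 more epochs.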


# Load the weights from the checkpoint with the lowest validation loss
model.load_weights('mnist.model.best.hdf5')

# Evaluate the model on the test set
model.evaluate(X_test, y_test, verbose=0)

[0.08776235084307846, 0.9739999771118164]

# Plot the loss and accuracy as training progresses
def plot_history(network_history):
    plt.figure()
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.plot(network_history.history['loss'])
    plt.plot(network_history.history['val_loss'])
    plt.legend(['Training', 'Validation'], loc='lower right')
    plt.rcParams['figure.figsize'] = (8.0, 4.0)
    plt.show()

    plt.figure()
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.plot(network_history.history['accuracy'])
    plt.plot(network_history.history['val_accuracy'])
    plt.legend(['Training', 'Validation'], loc='lower right')
    plt.rcParams['figure.figsize'] = (8.0, 4.0)
    plt.show()

plot_history(hist)

Training loss (or cost) and accuracy level off as the number of epochs grows. Validation loss, however, is lowest at epoch 3 (0.09795) and drifts upward afterwards, which is why the checkpoint with the best validation loss is the one read back into the model.

# Predict the test sample at index 8
i = 8
plt.imshow(X_test[i])
img_test = X_test[i].reshape(-1, 28, 28)
prediction = model.predict(img_test)[0]
title = 'The true label:{}\nThe predicted label:{}'.format(np.argmax(y_test[i]), np.argmax(prediction))
plt.title(title)
plt.rcParams['figure.figsize'] = (8.0, 4.0)
plt.show()

plt.bar(range(10), prediction)
plt.title('Predicted probabilities')
plt.xticks([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
plt.show()

# Predict the test sample at index 1264
i = 1264
plt.imshow(X_test[i])
img_test = X_test[i].reshape(-1, 28, 28)
prediction = model.predict(img_test)[0]
title = 'The true label:{}\nThe predicted label:{}'.format(np.argmax(y_test[i]), np.argmax(prediction))
plt.title(title)
plt.rcParams['figure.figsize'] = (8.0, 4.0)
plt.show()

plt.bar(range(10), prediction)
plt.title('Predicted probabilities')
plt.xticks([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
plt.show()
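The predicted label in the plots is simply the index of the largest softmax output. The sketch below shows that step in plain Python, along with a ranking of the runner-up classes; the probability vector is hypothetical (the real values come from `model.predict`), and the variable names are assumptions.

```python
# Hypothetical softmax output for one image; real values come from model.predict.
prediction = [0.02, 0.02, 0.01, 0.03, 0.05, 0.70, 0.10, 0.02, 0.04, 0.01]

# Equivalent of np.argmax: the index of the largest probability.
predicted_label = max(range(len(prediction)), key=lambda k: prediction[k])
confidence = prediction[predicted_label]

# Rank the classes to see which digits the model considers next most likely.
ranking = sorted(range(len(prediction)), key=lambda k: prediction[k], reverse=True)

print(predicted_label, confidence, ranking[:3])
```

Looking at the second and third entries of the ranking is a quick way to see which digits the model tends to confuse with the predicted one.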

# Use TensorBoard in TensorFlow to visualize the model graph (TensorFlow 1.x API)
import tensorflow as tf
writer = tf.summary.FileWriter('./log/', tf.get_default_graph())
writer.close()

# To start the TensorBoard web server, enter the following command in CMD and open the URL it prints
tensorboard --logdir=./log

Some reflections

A fully connected neural network flattens each 28×28 image into a vector of length 784 and feeds it to a multilayer perceptron, in which every neuron in one layer is connected to every neuron in the next layer.

Recognition principle

The data is reduced nonlinearly with t-SNE and viewed in three-dimensional space: points that are far apart in two-dimensional space are also far apart in three-dimensional space, and points that are close in two-dimensional space are also close in three-dimensional space. The following figure shows the 50th iteration:

The following figure shows the 100th iteration:

The following figure shows the process of more than 300 iterations in two-dimensional space:

Observation shows that 4, 9, and 7 are highly similar, and that 1 and 7 are easily confused... (there are too many conclusions to list them all). Zooming in on the figure, you can see that 3, 5, 7, and 4 are written very similarly and are easy to confuse!

TIPS

Data set: stores the pixels of each image
Label set: stores the index of the label corresponding to each image
Advantages: With the TensorFlow Keras framework it is easy to import the MNIST dataset and train with Keras's highly encapsulated models. It suits beginners: there is no need to write the details of the model code from scratch, and data processing is convenient.
Disadvantages: The framework and model are fairly static, with limited tunability and flexibility.

This process can also be trained with PyTorch, but the code is harder and requires more hand-writing, so the next step is to learn to train and predict with PyTorch (without calling Keras)... (To be continued...)

The figures come from Embedding Projector, Google's open-source web tool for visualizing high-dimensional data: projector.tensorflow.org/

2020.1.20
