
A First Look at Building Neural Networks with Keras

In this section, we will learn how to build a neural network model with Keras. Using the same simple dataset as in Learning Neural Network Forward Propagation from Scratch – Nuggets (juejin.cn), we define the model as follows:

  • The input is connected to a hidden layer with three nodes
  • The hidden layer is connected to the output layer, which has a single node
  1. Define the dataset and import the required libraries:
import keras
import numpy as np

# training data: the targets follow y = 3x
x = np.array([[1], [2], [3], [7]])
y = np.array([[3], [6], [9], [21]])
  2. Instantiate a neural network model that computes sequentially: multiple network layers can be stacked, and the computation is carried out in the order in which the layers are stacked. The Sequential class constructs such a sequential model:
model = keras.models.Sequential()
  3. Add a Dense layer (fully connected layer) to the model. A Dense layer fully connects adjacent layers (every node in the previous layer is connected to every node in the current layer), and it works the same way as the hidden layer we used in Learning Neural Network Forward Propagation from Scratch – Nuggets (juejin.cn). In the following code, we connect the input layer to the hidden layer:
model.add(keras.layers.Dense(3, activation='relu', input_shape=(1,)))

In the Dense layer initialized in the previous code, we must provide the input shape (since this is the first fully connected layer, we need to specify the shape of the data the model expects to receive; note that input_shape must be a tuple, hence (1,)). The hidden layer has three nodes, and the activation function used in the hidden layer is the ReLU function.

  4. Connect the hidden layer to the output layer:
model.add(keras.layers.Dense(1, activation='linear'))

In this Dense layer, we do not need to specify the input shape, because the model infers it from the previous layer. The output layer has a single node and uses a linear activation function.
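For reference, the same two-layer model can equivalently be built in a single step by passing the list of layers to Sequential; a minimal sketch:

# equivalent construction: pass the stacked layers directly to Sequential
model = keras.models.Sequential([
    keras.layers.Dense(3, activation='relu', input_shape=(1,)),  # hidden layer
    keras.layers.Dense(1, activation='linear'),                  # output layer
])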

The model summary can be displayed as follows:

model.summary()

The model summary is as follows:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 3)                 6         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 4         
=================================================================
Total params: 10
Trainable params: 10
Non-trainable params: 0
_________________________________________________________________

As the model summary shows, the connection from the input layer to the hidden layer has six parameters in total (three weights and three bias terms), and the connection from the hidden layer to the output layer has three weights and one bias term.
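To make these counts concrete, here is the arithmetic behind the summary (a Dense layer has inputs × units weights plus units bias terms):

# parameter count of a Dense layer: inputs * units (weights) + units (biases)
hidden_params = 1 * 3 + 3   # input (1 feature) -> hidden (3 nodes): 6 parameters
output_params = 3 * 1 + 1   # hidden (3 nodes) -> output (1 node): 4 parameters
print(hidden_params + output_params)  # 10, matching "Total params: 10"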

  5. Compile the model. First, we define the loss function and the optimizer, along with the optimizer's learning rate:
from keras.optimizers import SGD
# stochastic gradient descent with learning rate 0.01
# (recent Keras versions use SGD(learning_rate=0.01) instead of lr)
sgd = SGD(lr=0.01)

The code above specifies stochastic gradient descent as the optimizer, with a learning rate of 0.01. The predefined optimizer (with its learning rate) and the loss function are passed as parameters to the compile method to compile the model:

model.compile(optimizer=sgd, loss='mean_squared_error')
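For reference, 'mean_squared_error' is the mean of the squared differences between the targets and the predictions. A quick NumPy check of the same quantity on the still-untrained model might look like this:

# mean squared error: mean((y_true - y_pred) ** 2)
y_pred = model.predict(x)          # predictions of the untrained model
print(np.mean((y - y_pred) ** 2))  # the quantity the SGD optimizer minimizes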
  6. Fit the model, updating the weights to optimize it:
model.fit(x, y, epochs=1, batch_size = 4, verbose=1)

The fit method receives the input x and the corresponding ground-truth values y. epochs is the number of passes over the training dataset, batch_size is the number of training samples used in each weight-update iteration, and verbose specifies what information is printed during training, which can include the loss values on the training and test datasets and the progress of model training.
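A single epoch is rarely enough in practice; a minimal sketch of training for longer and reading the recorded loss from the History object that fit returns:

# train for more epochs and inspect the per-epoch loss history
history = model.fit(x, y, epochs=100, batch_size=4, verbose=0)
print(history.history['loss'][-1])  # final mean squared error on the training data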

  7. Extract the weight values. The weight information is obtained through the model's weights attribute:
model.weights

The weight information obtained is as follows:

[<tf.Variable 'dense/kernel:0' shape=(1, 3) dtype=float32, numpy=array([[1.1533519, 1.2411805, 0.39152434]], dtype=float32)>,
 <tf.Variable 'dense/bias:0' shape=(3,) dtype=float32, numpy=array([ 0.03425962, -0.05432956, -0.1607531 ], dtype=float32)>,
 <tf.Variable 'dense_1/kernel:0' shape=(3, 1) dtype=float32, numpy=array([[1.2210085], [1.2086679], [0.21541257]], dtype=float32)>,
 <tf.Variable 'dense_1/bias:0' shape=(1,) dtype=float32, numpy=array([0.09131978], dtype=float32)>]

From the output above, you can see that the weights printed first are the three weights and three bias terms of the dense layer, followed by the three weights and one bias term of the dense_1 layer. The output includes each weight's shape, data type, and parameter values. We can also extract just the values of these weights:

print(model.get_weights())

The weights are returned as a list of arrays, where each array corresponds to the matching item in the model.weights output:

[array([[1.1533519, 1.2411805, 0.39152434]], dtype=float32),
 array([ 0.03425962, -0.05432956, -0.1607531 ], dtype=float32),
 array([[1.2210085], [1.2086679], [0.21541257]], dtype=float32),
 array([0.09131978], dtype=float32)]
  8. Use the predict method to predict the output for a new set of inputs:
x1 = np.array([[5], [6]])  # new inputs to predict
output = model.predict(x1)
print(output)

x1 is the variable that holds the new test inputs for which we want to predict the output. Like the fit method, the predict method takes an array as its input. The output of the code looks like this:

[[14.996691]
 [17.989458]]

After training for multiple epochs, the network's output will be very close to the expected outputs (15 and 18).
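To close the loop with the from-scratch articles linked below, here is a minimal sketch (assuming the weight values printed above) that reproduces model.predict by hand with NumPy:

# manual forward propagation using the extracted weights
w1, b1, w2, b2 = model.get_weights()
hidden = np.maximum(0, np.dot(x1, w1) + b1)  # Dense layer + ReLU activation
manual_output = np.dot(hidden, w2) + b2      # linear output layer
print(manual_output)                         # matches model.predict(x1)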

Related links

Learning Neural Network Forward Propagation from Scratch – Nuggets (juejin.cn)

Learning Neural Network Backpropagation from Scratch – Nuggets (juejin.cn)