The Keras Python library makes it quick and easy to create deep learning models. The Sequential API allows you to create models layer by layer for most problems. It is limited in that it does not allow you to create models that share layers or have multiple inputs or outputs. The functional API in Keras is an alternative way of creating models that offers a lot more flexibility, including the ability to define more complex models.

In this tutorial, you will discover how to use the more flexible functional API in Keras to define deep learning models. After completing this tutorial, you will know:

  • The difference between the Sequential and functional APIs.
  • How to use the functional API to define simple multilayer perceptron, convolutional neural network, and recurrent neural network models.
  • How to define more complex models with shared layers and multiple inputs and outputs.

Tutorial overview

This tutorial is divided into seven parts. They are:

  • Keras Sequential model
  • Keras functional model
  • Standard network models
  • Shared layer models
  • Multiple input and output models
  • Best practices
  • New: Explanation of the functional API Python syntax

1. Keras Sequential model

As a review, Keras provides a Sequential model API. This is a way of creating a deep learning model where an instance of the Sequential class is created, and model layers are created and added to it. If you are new to Keras or deep learning, see this step-by-step Keras tutorial. For example, layers can be defined and passed to the Sequential constructor as an array:

from keras.models import Sequential
from keras.layers import Dense
model = Sequential([Dense(2, input_dim=1), Dense(1)])

Layers can also be added piecewise:

from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(2, input_dim=1))
model.add(Dense(1))

The Sequential model API is great for developing deep learning models in most situations, but it also has some limitations. For example, it is not straightforward to define models that may have multiple input sources, produce multiple output targets, or reuse layers.

2. Keras functional model

The Keras functional API provides a more flexible way to define models. In particular, it allows you to define models with multiple inputs or outputs, as well as models that share layers. More than that, it allows you to define ad hoc, directed acyclic graphs of layers. Models are defined by creating instances of layers and connecting them directly to each other in pairs, then defining a Model that specifies which layers serve as the inputs and outputs of the model.

Let’s look at the three unique aspects of the Keras functional API in turn:

1. Define the input

Unlike the Sequential model, you must create and define a standalone Input layer that specifies the shape of the input data. The Input layer takes a shape argument, which is a tuple indicating the dimensionality of the input data. When the input data is one-dimensional, as for a multilayer perceptron, the shape must explicitly leave room for the shape of the mini-batch size used when splitting the data during training. Therefore, the shape tuple is always defined with a hanging last dimension when the input is one-dimensional, for example (2,):

from keras.layers import Input
visible = Input(shape=(2,))

2. Connect the layers

The layers in the model are connected in pairs. This is done by specifying where the input comes from when defining each new layer. Bracket notation is used, so that after a layer is created, the layer from which the input to the current layer comes is specified in parentheses. Let’s make this clear with a short example. We can create the Input layer as described above, then create a hidden Dense layer that receives input only from the Input layer.

from keras.layers import Input
from keras.layers import Dense
visible = Input(shape=(2,))
hidden = Dense(2)(visible)

Note the (visible) after the creation of the Dense layer, which connects the output of the Input layer as the input to the Dense layer. It is this layer-by-layer way of connecting layers that gives the functional API its flexibility. For example, you can see how easy it would be to start defining ad hoc graphs of layers.

3. Create the model

After you have created all of your model layers and connected them together, you must define the model. As with the Sequential API, the model is the thing you can compile, fit, evaluate, and use to make predictions. Keras provides a Model class that you can use to create a model from your created layers. It only requires that you specify the input and output layers. For example:

from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
visible = Input(shape=(2,))
hidden = Dense(2)(visible)
model = Model(inputs=visible, outputs=hidden)

Now that we know all of the key pieces of the Keras functional API, let’s work through defining a suite of different models and build up some practice with it. Each example is executable and prints the structure and creates a diagram of the graph. I recommend doing this for your own models to make it clear exactly what you have defined. My hope is that these examples provide templates for you in the future when you want to define your own models using the functional API.
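
Before the examples, here is a minimal sketch of the compile/fit/predict workflow applied to a small functional model. The sigmoid output layer, synthetic data, and training settings (optimizer, loss, epochs, batch size) are assumptions added for illustration, not part of the original example. Note also that plot_model, used throughout the examples below, requires the pydot and graphviz libraries to be installed.

# minimal sketch: compile, fit, and predict with a functional model
# (data and training settings are assumptions for demonstration)
import numpy as np
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
visible = Input(shape=(2,))
hidden = Dense(2, activation='relu')(visible)
output = Dense(1, activation='sigmoid')(hidden)  # assumed output layer for binary targets
model = Model(inputs=visible, outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy')
X = np.random.rand(100, 2)                 # synthetic inputs
y = np.random.randint(2, size=(100, 1))    # synthetic binary labels
model.fit(X, y, epochs=5, batch_size=10, verbose=0)
predictions = model.predict(X[:5])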

3. Standard network models

When getting started with the functional API, it is a good idea to see how some standard neural network models are defined. In this section, we will look at defining a simple multilayer perceptron, a convolutional neural network, and a recurrent neural network. These examples will provide a foundation for understanding the more elaborate examples later.

  • Multilayer perceptron

In this section, we define a multilayer perceptron model for binary classification. The model has 10 inputs, three hidden layers with 10, 20, and 10 neurons, and an output layer with one output. Rectified linear activation functions are used in each hidden layer, and a sigmoid activation function is used in the output layer for binary classification.

# Multilayer Perceptron
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
visible = Input(shape=(10,))
hidden1 = Dense(10, activation='relu')(visible)
hidden2 = Dense(20, activation='relu')(hidden1)
hidden3 = Dense(10, activation='relu')(hidden2)
output = Dense(1, activation='sigmoid')(hidden3)
model = Model(inputs=visible, outputs=output)
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='multilayer_perceptron_graph.png')

The output is as follows:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 10)                0
_________________________________________________________________
dense_1 (Dense)              (None, 10)                110
_________________________________________________________________
dense_2 (Dense)              (None, 20)                220
_________________________________________________________________
dense_3 (Dense)              (None, 10)                210
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 11
=================================================================
Total params: 551
Trainable params: 551
Non-trainable params: 0
_________________________________________________________________

The network structure diagram is as follows:

  • Convolutional neural network

In this section, we will define a convolutional neural network for image classification. The model receives black-and-white 64×64 images as input, then has a sequence of two convolutional and pooling layers as feature extractors, followed by a fully connected layer to interpret the features and an output layer with sigmoid activation for two-class prediction.

# Convolutional Neural Network
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
visible = Input(shape=(64, 64, 1))
conv1 = Conv2D(32, kernel_size=4, activation='relu')(visible)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(16, kernel_size=4, activation='relu')(pool1)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
flat = Flatten()(pool2)
hidden1 = Dense(10, activation='relu')(flat)
output = Dense(1, activation='sigmoid')(hidden1)
model = Model(inputs=visible, outputs=output)
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='convolutional_neural_network.png')

The output is as follows:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 64, 64, 1)         0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 61, 61, 32)        544
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 30, 30, 32)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 27, 27, 16)        8208
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 13, 13, 16)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 2704)              0
_________________________________________________________________
dense_1 (Dense)              (None, 10)                27050
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 11
=================================================================
Total params: 35,813
Trainable params: 35,813
Non-trainable params: 0
_________________________________________________________________

The network structure diagram is as follows:

  • Recurrent neural network

In this section, we will define a long short-term memory (LSTM) recurrent neural network for sequence classification. The model expects 100 time steps of one feature as input. The model has a single LSTM hidden layer to extract features from the sequence, followed by a fully connected layer to interpret the LSTM output, followed by an output layer for making binary predictions.

# Recurrent Neural Network
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers.recurrent import LSTM
visible = Input(shape=(100, 1))
hidden1 = LSTM(10)(visible)
hidden2 = Dense(10, activation='relu')(hidden1)
output = Dense(1, activation='sigmoid')(hidden2)
model = Model(inputs=visible, outputs=output)
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='recurrent_neural_network.png')

The output is as follows:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 100, 1)            0
_________________________________________________________________
lstm_1 (LSTM)                (None, 10)                480
_________________________________________________________________
dense_1 (Dense)              (None, 10)                110
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 11
=================================================================
Total params: 601
Trainable params: 601
Non-trainable params: 0
_________________________________________________________________

The model structure diagram is as follows:

4. Shared layer models

Multiple layers can share the output of one layer. For example, there may be multiple different feature extraction layers from a single input, or multiple layers used to interpret the output of a feature extraction layer. Let’s look at both of these examples.

  • Shared input layer

In this section, we define multiple convolutional layers with differently sized kernels to interpret an image input. The model takes black-and-white images with a size of 64×64 pixels. There are two CNN feature extraction submodels that share this input; the first has a kernel size of 4 and the second a kernel size of 8. The outputs of these feature extraction submodels are flattened into vectors, concatenated into one long vector, and passed to a fully connected layer for interpretation. Finally, an output layer performs the binary classification.

# Shared Input Layer
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras.layers.merge import concatenate
# input layer
visible = Input(shape=(64, 64, 1))
# first feature extractor
conv1 = Conv2D(32, kernel_size=4, activation='relu')(visible)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
flat1 = Flatten()(pool1)
# second feature extractor
conv2 = Conv2D(16, kernel_size=8, activation='relu')(visible)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
flat2 = Flatten()(pool2)
# merge feature extractors
merge = concatenate([flat1, flat2])
# interpretation layer
hidden1 = Dense(10, activation='relu')(merge)
# prediction output
output = Dense(1, activation='sigmoid')(hidden1)
model = Model(inputs=visible, outputs=output)
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='shared_input_layer.png')

The output is as follows:

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
input_1 (InputLayer)             (None, 64, 64, 1)     0
____________________________________________________________________________________________________
conv2d_1 (Conv2D)                (None, 61, 61, 32)    544         input_1[0][0]
____________________________________________________________________________________________________
conv2d_2 (Conv2D)                (None, 57, 57, 16)    1040        input_1[0][0]
____________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)   (None, 30, 30, 32)    0           conv2d_1[0][0]
____________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)   (None, 28, 28, 16)    0           conv2d_2[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 28800)         0           max_pooling2d_1[0][0]
____________________________________________________________________________________________________
flatten_2 (Flatten)              (None, 12544)         0           max_pooling2d_2[0][0]
____________________________________________________________________________________________________
concatenate_1 (Concatenate)      (None, 41344)         0           flatten_1[0][0]
                                                                   flatten_2[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 10)            413450      concatenate_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 1)             11          dense_1[0][0]
====================================================================================================
Total params: 415,045
Trainable params: 415,045
Non-trainable params: 0
____________________________________________________________________________________________________

The network structure diagram is as follows:

  • Shared feature extraction layer

In this section, we will use two parallel submodels to interpret the output of an LSTM feature extractor for sequence classification. The input to the model is 100 time steps of one feature. An LSTM layer with 10 memory cells interprets this sequence. The first interpretation model is a shallow, single fully connected layer; the second is a deep, three-layer model. The outputs of both interpretation models are concatenated into one long vector, which is passed to the output layer to make a binary prediction.

# Shared Feature Extraction Layer
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers.recurrent import LSTM
from keras.layers.merge import concatenate
# define input
visible = Input(shape=(100, 1))
# feature extraction
extract1 = LSTM(10)(visible)
# first interpretation model
interp1 = Dense(10, activation='relu')(extract1)
# second interpretation model
interp11 = Dense(10, activation='relu')(extract1)
interp12 = Dense(20, activation='relu')(interp11)
interp13 = Dense(10, activation='relu')(interp12)
# merge interpretation
merge = concatenate([interp1, interp13])
# output
output = Dense(1, activation='sigmoid')(merge)
model = Model(inputs=visible, outputs=output)
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='shared_feature_extractor.png')

The output is as follows:

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
input_1 (InputLayer)             (None, 100, 1)        0
____________________________________________________________________________________________________
lstm_1 (LSTM)                    (None, 10)            480         input_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 10)            110         lstm_1[0][0]
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 20)            220         dense_2[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 10)            110         lstm_1[0][0]
____________________________________________________________________________________________________
dense_4 (Dense)                  (None, 10)            210         dense_3[0][0]
____________________________________________________________________________________________________
concatenate_1 (Concatenate)      (None, 20)            0           dense_1[0][0]
                                                                   dense_4[0][0]
____________________________________________________________________________________________________
dense_5 (Dense)                  (None, 1)             21          concatenate_1[0][0]
====================================================================================================
Total params: 1,151
Trainable params: 1,151
Non-trainable params: 0
____________________________________________________________________________________________________

The network structure diagram is as follows:

5. Multiple input and output models

The functional API can also be used to develop more complex models with multiple inputs, possibly of different modalities. It can also be used to develop models that produce multiple outputs. We will look at an example of each in this section.

  • Multiple input model

We will develop an image classification model that takes two versions of an image as input, each of a different size: specifically, a black-and-white 64×64 version and a color 32×32 version. A separate CNN model performs feature extraction on each, then the results of both models are concatenated for interpretation and a final prediction.

Note that in the creation of the Model() instance, we define the two input layers as an array. Specifically:

model = Model(inputs=[visible1, visible2], outputs=output)

Examples are as follows:

# Multiple Inputs
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras.layers.merge import concatenate
# first input model
visible1 = Input(shape=(64, 64, 1))
conv11 = Conv2D(32, kernel_size=4, activation='relu')(visible1)
pool11 = MaxPooling2D(pool_size=(2, 2))(conv11)
conv12 = Conv2D(16, kernel_size=4, activation='relu')(pool11)
pool12 = MaxPooling2D(pool_size=(2, 2))(conv12)
flat1 = Flatten()(pool12)
# second input model
visible2 = Input(shape=(32, 32, 3))
conv21 = Conv2D(32, kernel_size=4, activation='relu')(visible2)
pool21 = MaxPooling2D(pool_size=(2, 2))(conv21)
conv22 = Conv2D(16, kernel_size=4, activation='relu')(pool21)
pool22 = MaxPooling2D(pool_size=(2, 2))(conv22)
flat2 = Flatten()(pool22)
# merge input models
merge = concatenate([flat1, flat2])
# interpretation model
hidden1 = Dense(10, activation='relu')(merge)
hidden2 = Dense(10, activation='relu')(hidden1)
output = Dense(1, activation='sigmoid')(hidden2)
model = Model(inputs=[visible1, visible2], outputs=output)
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='multiple_inputs.png')

The output is as follows:

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
input_1 (InputLayer)             (None, 64, 64, 1)     0
____________________________________________________________________________________________________
input_2 (InputLayer)             (None, 32, 32, 3)     0
____________________________________________________________________________________________________
conv2d_1 (Conv2D)                (None, 61, 61, 32)    544         input_1[0][0]
____________________________________________________________________________________________________
conv2d_3 (Conv2D)                (None, 29, 29, 32)    1568        input_2[0][0]
____________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)   (None, 30, 30, 32)    0           conv2d_1[0][0]
____________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)   (None, 14, 14, 32)    0           conv2d_3[0][0]
____________________________________________________________________________________________________
conv2d_2 (Conv2D)                (None, 27, 27, 16)    8208        max_pooling2d_1[0][0]
____________________________________________________________________________________________________
conv2d_4 (Conv2D)                (None, 11, 11, 16)    8208        max_pooling2d_3[0][0]
____________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)   (None, 13, 13, 16)    0           conv2d_2[0][0]
____________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)   (None, 5, 5, 16)      0           conv2d_4[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 2704)          0           max_pooling2d_2[0][0]
____________________________________________________________________________________________________
flatten_2 (Flatten)              (None, 400)           0           max_pooling2d_4[0][0]
____________________________________________________________________________________________________
concatenate_1 (Concatenate)      (None, 3104)          0           flatten_1[0][0]
                                                                   flatten_2[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 10)            31050       concatenate_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 10)            110         dense_1[0][0]
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 1)             11          dense_2[0][0]
====================================================================================================
Total params: 49,699
Trainable params: 49,699
Non-trainable params: 0
____________________________________________________________________________________________________

The network structure diagram is as follows:
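
To train a multi-input model like this one, the input arrays are passed to fit() as a list in the same order as the inputs given to Model(). Below is a minimal, hypothetical sketch of that call against the model defined above; the synthetic data, optimizer, loss, and training settings are assumptions for illustration only.

# minimal sketch: fitting a two-input model (synthetic data, assumed settings)
import numpy as np
X1 = np.random.rand(50, 64, 64, 1)        # black-and-white 64x64 inputs
X2 = np.random.rand(50, 32, 32, 3)        # color 32x32 inputs
y = np.random.randint(2, size=(50, 1))    # synthetic binary labels
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit([X1, X2], y, epochs=2, batch_size=10, verbose=0)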

  • Multiple output model

In this section, we will develop a model that makes two different types of predictions. Given an input sequence of 100 time steps of one feature, the model will both classify the sequence and output a new sequence of the same length. An LSTM layer interprets the input sequence and returns the hidden state for each time step. The first output model creates a stacked LSTM, interprets the features, and makes a binary prediction. The second output model uses the same output layer to make a real-valued prediction for each input time step.

# Multiple Outputs
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers.recurrent import LSTM
from keras.layers.wrappers import TimeDistributed
# input layer
visible = Input(shape=(100, 1))
# feature extraction
extract = LSTM(10, return_sequences=True)(visible)
# classification output
class11 = LSTM(10)(extract)
class12 = Dense(10, activation='relu')(class11)
output1 = Dense(1, activation='sigmoid')(class12)
# sequence output
output2 = TimeDistributed(Dense(1, activation='linear'))(extract)
# output
model = Model(inputs=visible, outputs=[output1, output2])
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='multiple_outputs.png')

The output is as follows:

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
input_1 (InputLayer)             (None, 100, 1)        0
____________________________________________________________________________________________________
lstm_1 (LSTM)                    (None, 100, 10)       480         input_1[0][0]
____________________________________________________________________________________________________
lstm_2 (LSTM)                    (None, 10)            840         lstm_1[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 10)            110         lstm_2[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 1)             11          dense_1[0][0]
____________________________________________________________________________________________________
time_distributed_1 (TimeDistribu (None, 100, 1)        11          lstm_1[0][0]
====================================================================================================
Total params: 1,452
Trainable params: 1,452
Non-trainable params: 0
____________________________________________________________________________________________________

The network structure diagram is as follows:
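
When fitting a model with multiple outputs, a target array is supplied for each output, and a separate loss can be specified per output. The sketch below, applied to the model defined above, is an assumed illustration: the synthetic targets, loss choices, and training settings are not part of the original example.

# minimal sketch: fitting a two-output model (synthetic data, assumed losses)
import numpy as np
X = np.random.rand(50, 100, 1)                 # synthetic input sequences
y_class = np.random.randint(2, size=(50, 1))   # one binary label per sequence
y_seq = np.random.rand(50, 100, 1)             # one real value per time step
model.compile(optimizer='adam', loss=['binary_crossentropy', 'mse'])
model.fit(X, [y_class, y_seq], epochs=2, batch_size=10, verbose=0)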

6. Best practices

In this section, I want to give you some tips to get the most out of the functional API when you are defining your own models.

  • Consistent variable names. Use the same variable names for the input (visible) and output (output) layers, and perhaps even the hidden layers (hidden1, hidden2). This will help you connect things together correctly.
  • Review the layer summary. Always print the model summary and review the layer outputs to make sure the model was connected together as you expected.
  • Review the graph plot. Always create a plot of the model graph and review it to make sure everything was put together as you intended.
  • Name the layers. You can assign names to layers, and these names are used when reviewing the model summary and plotting the model graph. For example: Dense(1, name='hidden1').
  • Separate submodels. Consider separating out the development of submodels and combining them together at the end, as in the sketch after this list.
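
As an illustration of that last tip, here is a minimal sketch of developing two pieces of a model as separate Python functions and composing them; the function names and layer sizes are invented for demonstration, not taken from the tutorial.

# minimal sketch: composing a model from separate submodel functions
# (function names and layer sizes are hypothetical)
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense

def feature_extractor(inputs):
    # a small fully connected feature extraction block
    return Dense(10, activation='relu')(inputs)

def classifier(features):
    # an interpretation block followed by a binary output
    hidden = Dense(10, activation='relu')(features)
    return Dense(1, activation='sigmoid')(hidden)

visible = Input(shape=(10,))
output = classifier(feature_extractor(visible))
model = Model(inputs=visible, outputs=output)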

Author: Yishui Hancheng, CSDN blog expert, personal research interests: machine learning, deep learning, NLP, CV

Blog: yishuihancheng.blog.csdn.net
