From Data Science Central

Compiled by Heart of the Machine

Contributor: Jiang Siyuan



In this article, we compare the performance of the major deep learning frameworks supported by Keras: TensorFlow, CNTK, MXNet, and Theano. The goal is to see how the different frameworks perform on different types of tasks by running the same models with different Keras backends. MLP, CNN, and RNN models are tested across five tasks. Heart of the Machine not only introduces this experiment but also uses Keras (with the TensorFlow backend) to run a CNN on the MNIST dataset.
If there is any doubt about the popularity of Keras for data science and deep learning, consider its support across all major cloud platforms and deep learning frameworks. Keras supports TensorFlow from Google, CNTK from Microsoft, and Theano from the University of Montreal, and AWS announced last year that Keras will support Apache MXNet. MXNet 0.11, released last month, added support for Core ML and Keras v1.2. However, so far MXNet appears to support only Keras v1.2.2 rather than the latest version, 2.0.5.

While we can deploy a model with any of the backends Keras supports, developers and solution architects should keep in mind that Keras, as a high-level API over various deep learning libraries, does not expose all of the fine-grained parameter tuning each library provides. So if we want to fine-tune every parameter a backend framework offers, we are better off using that framework directly rather than Keras. This will improve over time as more tooling is built around Keras and the underlying frameworks, but for now Keras remains an excellent tool for the early stages of deep learning development, giving data engineers and data scientists a powerful way to quickly build and test complex deep learning models.

Heart of the Machine also tested Keras with TensorFlow as the backend, and we found the whole model very simple to build; even beginners can easily read the entire network architecture. Using Keras as a high-level API with TensorFlow as the backend is much easier than building a convolutional neural network in TensorFlow directly. Later, we will upload the code and comments for this Keras CNN implementation to the Heart of the Machine GitHub project. The following image shows training initialization with TensorFlow as the backend:

The following is the architecture of the whole convolutional network:
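Since the original code screenshot is not reproduced here, below is a minimal sketch of the network described in the next paragraph, following the standard mnist_cnn.py example from the Keras examples repository. The second Conv2D layer (64 filters), the Dense layer sizes, and the optimizer are that example's defaults rather than values confirmed by this article.

# A minimal sketch of the CNN described below (Keras 2, TensorFlow backend).
# Layer sizes marked "example default" come from the Keras mnist_cnn.py
# example, not from this article.
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D

num_classes = 10
input_shape = (28, 28, 1)  # MNIST images, channels-last

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                 input_shape=input_shape))           # 32 kernels of size 3x3
model.add(Conv2D(64, (3, 3), activation='relu'))     # example default
model.add(MaxPooling2D(pool_size=(2, 2)))            # 2x2 down-sampling
model.add(Dropout(0.25))                             # drop 25% of inputs
model.add(Flatten())                                 # conv -> fully connected
model.add(Dense(128, activation='relu'))             # example default
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
model.summary()  # prints the layer-by-layer architecture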



The code above defines the layers that are stacked to form the entire network. Sequential denotes the sequential model, a linear stack of network layers. After creating the sequential model, we build the network by adding layers one by one, starting from the input layer. In the architecture above, a 2D convolution layer (Conv2D) comes first, with a 3x3 kernel and a ReLU activation; the first argument, 32, is the number of convolution kernels. The network also uses a max-pooling layer, MaxPooling2D, where pool_size=(2, 2) is the down-sampling factor in the two directions (vertical and horizontal). The Dropout layer randomly disconnects input neurons with probability 0.25 at each parameter update; the Dense layer is a fully connected layer; and the Flatten layer "flattens" the input, collapsing a multidimensional input to one dimension, as is common in the transition from convolutional layers to fully connected layers. These are the basic layers of the architecture; for more detailed code and comments, see the Heart of the Machine GitHub project.
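To complete the picture, here is a continuation of the sketch above that loads MNIST and starts training. The batch size and epoch count are the Keras example's defaults, not values reported in this article.

# Continues the sketch above; assumes the `model` and `num_classes` defined there.
from keras.datasets import mnist
from keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)

model.fit(x_train, y_train, batch_size=128, epochs=12,
          validation_data=(x_test, y_test))
print('Test accuracy:', model.evaluate(x_test, y_test, verbose=0)[1])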

Here are the details of Jasmeet Bhatia’s assessment.



Keras backend framework performance test

Keras also lets developers quickly test the relative performance of different deep learning frameworks as Keras backends. A parameter in the Keras configuration file determines which framework is used as the backend, so we can build one model and run it unchanged on different frameworks (e.g. TensorFlow, CNTK, Theano). For MXNet, only Keras v1.2.2 is currently supported, so a few code changes are needed. Of course, each model could be fine-tuned for better performance against the specific libraries of each framework, but Keras still offers a good opportunity to compare the base performance of these libraries. Switching backends is shown in the sketch below.
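As an illustration of that configuration parameter (the mechanism is standard Keras, but these exact lines are not from the article): the backend is read from the "backend" field of ~/.keras/keras.json, and the KERAS_BACKEND environment variable overrides it, which is convenient for scripted benchmark runs.

# The backend must be chosen before keras is imported.
import os
os.environ['KERAS_BACKEND'] = 'cntk'  # or 'tensorflow' / 'theano'

import keras
print(keras.backend.backend())  # confirm which backend was loaded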

Several earlier articles have compared the relative performance of Keras-supported backends, but those comparisons are now dated and focused mainly on TensorFlow and Theano. This article therefore compares the latest versions of the frameworks Keras supports, across a larger set of tests.

Let's start with the test configuration. All performance tests were run on Azure NC6 VMs with an Nvidia Tesla K80 GPU, using an Azure DSVM (Data Science Virtual Machine) image on Ubuntu. Along with other data science tools, the image comes with Keras, TensorFlow, Theano, and MXNet pre-installed. For the tests, all packages were the latest versions, except that MXNet, which only supports Keras 1.2.2, used that older version.



Configuration

Because each deep learning framework has different dependencies, our tests were run in one of three configurations:



Performance tests

To compare the frameworks' performance, we used five different test models, described below. To ensure that no framework received special treatment, all models come from the examples maintained in the Keras GitHub repository.

  • Source of the models: https://github.com/fchollet/keras/tree/master/examples
  • The test code can be found in the author's GitHub project: https://github.com/jasmeetsb/deep-learning-keras-projects
Note: MXNet was not involved in two of the tests, because MXNet does not support the latest version of Keras, and running those models with it as the backend would have required extensive code changes. Using MXNet as the backend in the other three tests also required minor adjustments, mainly to account for functions renamed in newer versions of Keras; an illustration of such renames follows.
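These renames are illustrative of the Keras 1.2 to 2.0 API changes (they are real Keras changes, but the article does not list which ones the benchmark code actually touched):

import keras

# Keras 2 renamed several APIs; code written for Keras 2 must be adapted to
# run on the Keras 1.2.2 build that the MXNet backend requires.
if keras.__version__.startswith('2'):
    from keras.layers import Conv2D                    # Keras 2 name
    conv = Conv2D(32, (3, 3), activation='relu')
    fit_kwargs = {'epochs': 10}                        # Keras 2 argument name
else:
    from keras.layers import Convolution2D             # Keras 1 name
    conv = Convolution2D(32, 3, 3, activation='relu')
    fit_kwargs = {'nb_epoch': 10}                      # Keras 1 argument name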

Test one: CIFAR-10 & CNN

  • Type of learning model: convolutional neural network (CNN)
  • Dataset/task: CIFAR-10 small image dataset
  • Objective: classify images into 10 categories
In training speed per epoch, TensorFlow is slightly faster than MXNet.

In terms of accuracy/convergence speed, CNTK was slightly ahead for the first 25 epochs, while after 50 epochs all frameworks reached similar accuracy, with CNTK dipping slightly.

Test two: MNIST & CNN

  • Type of learning model: CNN
  • Dataset/task: MNIST handwritten digit dataset
  • Objective: classify images into 10 digit classes
In this test, TensorFlow was clearly superior in training time, but all frameworks showed similar accuracy/convergence behavior.

Test three: MNIST & MLP

  • Type of learning model: multilayer perceptron / deep neural network
  • Dataset/task: MNIST handwritten digit dataset
  • Objective: classify images into 10 digit classes
In this standard neural network test on the MNIST dataset, CNTK, TensorFlow, and Theano achieved similar speeds (2.5 to 2.7 s/epoch), while MXNet needed only 1.4 s/epoch. MXNet also has a slight edge in accuracy/convergence speed.

Test four: MNIST & RNN

  • Type of learning model: hierarchical recurrent neural network (HRNN)
  • Dataset/task: MNIST handwritten digit dataset
  • Objective: classify images into 10 digit classes (a sketch of the HRNN idea follows this list)
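As a sketch of what a hierarchical RNN on MNIST looks like, in the spirit of the Keras mnist_hierarchical_rnn example (the layer sizes are that example's defaults, not values confirmed by this article): one LSTM encodes each pixel row, and a second LSTM encodes the sequence of row encodings.

from keras.layers import Input, LSTM, TimeDistributed, Dense
from keras.models import Model

x = Input(shape=(28, 28, 1))                 # rows x columns x channels
row_encoded = TimeDistributed(LSTM(128))(x)  # encode each 28-pixel row
image_encoded = LSTM(128)(row_encoded)       # encode the sequence of rows
out = Dense(10, activation='softmax')(image_encoded)

model = Model(x, out)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
              metrics=['accuracy'])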
In training time, CNTK and MXNet performed similarly (162 to 164 s/epoch), TensorFlow took 179 s/epoch, and Theano required substantially more time.

Test five: bAbI & RNN

  • Type of learning model: recurrent neural network (RNN)
  • Dataset/task: the bAbI project (https://research.fb.com/downloads/babi/)
  • Objective: train two recurrent neural networks, one on a story and one on a question, so that their combined vector can answer a series of bAbI tasks (a sketch follows this list)
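As a sketch of this two-encoder setup, in the spirit of the Keras babi_rnn example (the vocabulary size, sequence lengths, and layer sizes here are placeholders, not values from the article):

from keras.layers import Input, Embedding, LSTM, concatenate, Dense
from keras.models import Model

vocab_size, story_maxlen, query_maxlen = 50, 100, 10  # placeholder sizes

story = Input(shape=(story_maxlen,), dtype='int32')
encoded_story = LSTM(64)(Embedding(vocab_size, 50)(story))

question = Input(shape=(query_maxlen,), dtype='int32')
encoded_question = LSTM(64)(Embedding(vocab_size, 50)(question))

merged = concatenate([encoded_story, encoded_question])  # combined vector
answer = Dense(vocab_size, activation='softmax')(merged)

model = Model([story, question], answer)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])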
MXNet was not used in this test; TensorFlow and Theano were more than twice as slow as CNTK per epoch.

Conclusion



  • TensorFlow performed best in the CNN tests, but not as well in the RNN tests.
  • CNTK was far better than TensorFlow and Theano in the bAbI RNN and MNIST RNN tests, but worse than TensorFlow in the CNN tests.
  • MXNet performed slightly better than CNTK and TensorFlow in the RNN tests, and better than all other frameworks on the MLP. However, MXNet does not support Keras v2 functions, so it could not be tested without code changes, which may introduce a slight bias.
  • Theano is slightly better than TensorFlow and CNTK on the deep neural network (MLP).
As these results show, every deep learning framework has its own areas of strength, and no single framework is uniformly better than the others. CNTK works well as a Keras backend for RNN use cases, TensorFlow for CNNs, and MXNet, while showing very strong performance potential, has yet to support all Keras functionality. In the open source community, these frameworks are being extended and enhanced to deliver better performance and easier deployment to production. When considering these frameworks for production, performance is paramount, but in most cases we also need to weigh ease of deployment and the ancillary tools that help manage a productized machine learning model. Finally, all frameworks were measured as Keras backends, so there is some margin of error, but this article should at least give a sense of how these frameworks perform and offer relatively objective guidance when choosing among backends.



Original link: http://www.datasciencecentral.com/profiles/blogs/search-for-the-fastest-deep-learning-framework-supported-by-keras