Francois Chollet, creator of the Keras framework, has highly recommended the article below as a way to learn Keras.

So I have translated the article he recommended and am sharing it with you here.

Introduction to Keras

A few months ago, I started a new role at Fast Forward Labs. Having just completed an image recognition project built on convolutional neural networks, my new colleagues and I were starting a new project: text summarization using recurrent neural networks (RNNs). I had never worked with neural networks before, so I had to learn fast.

Despite the recent excitement around deep learning, neural networks have a bad reputation among non-specialists: they seem to be computation-hungry to train, hard to explain, and hard to build.

That was exactly how I felt. And while it is true that they are hard to interpret, and that training and running large neural networks still demands powerful hardware, building and exploring neural networks has recently become much easier.

The reason is that high-level neural network libraries now allow developers to quickly build network models without worrying about the numerical details of floating point operations, tensor algebra, and GPU programming.

Today we are going to look at one of these libraries: Keras. Keras is a high-level neural network library that, among many other features, wraps an API similar to Scikit-Learn's around either a Theano or a TensorFlow backend.

Because of that similarity to Scikit-Learn, this tutorial introduces Keras by comparing the two libraries side by side.

Scikit-Learn is the most popular, most full-featured, and most mature machine learning library in the Python community. Among its many virtues, I like its simple, coherent, and consistent API, which is built around Estimator objects. The API describes machine learning workflows in a way that is familiar to most engineers, and it is consistent across the whole library.

Let's start by importing the libraries we need: Scikit-Learn, Keras, and some plotting tools.

%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

from sklearn.cross_validation import train_test_split
from sklearn.linear_model import LogisticRegressionCV

from keras.models import Sequential  # Keras==1.0.7
from keras.layers.core import Dense, Activation
from keras.utils import np_utils
/Users/WWJ/env/lib/python2.7/site-packages/matplotlib/__init__.py:872: UserWarning: axes.color_cycle is deprecated and replaced with axes.prop_cycle; please use the latter.
  warnings.warn(self.msg_depr % (key, alt_key))
Using Theano backend.

Iris Data

The famous Iris dataset (published by Ronald Fisher in 1936) is a great way to demonstrate a machine learning framework's API. In a sense, it is the Hello World of machine learning.

The dataset is simple, and classifying it with high accuracy is easy. Using a neural network on it is overkill, but it's fun! Our goal is to explore the code needed to classify this data, not the details of model design and selection.

The Iris dataset is built into many machine learning libraries. I like to get it from Seaborn, which loads it as a labeled dataframe that is easy to visualize. Since we are going to use Seaborn anyway, let's load the dataset from there and look at the first five examples.



iris = sns.load_dataset("iris")
iris.head()

Out[2]:
sepal_length sepal_width petal_length petal_width species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa

Each example (each flower) has five fields. Four of them are measurements of the flower's size in centimeters, and the fifth is the species of iris. There are three species: setosa, versicolor, and virginica. Our job is to build a classifier that, given the two petal and two sepal measurements, predicts the species. Before we build the model, let's visualize the data (it is usually a good idea to start with a visualization):

sns.pairplot(iris, hue='species');

Split and convert the data for training and testing

First we need to pull the raw data out of the iris dataframe. We store the petal and sepal measurements in an array X (the features), and the corresponding species labels in an array y (the targets).



X = iris.values[:, :4]
y = iris.values[:, 4]


As is standard in supervised machine learning, we train the model on part of the dataset and use the rest to test its performance. This is easy to do by hand, but the work has been abstracted into the train_test_split() function in Scikit-Learn.

train_X, test_X, train_y, test_y = train_test_split(X, y, train_size=0.5, random_state=0)

Train a Scikit-Learn classifier

We will train a logistic regression classifier. With Scikit-Learn's built-in hyperparameter cross-validation, this takes just one line of code. Like all Scikit-Learn Estimator objects, the LogisticRegressionCV classifier has a .fit() method that adjusts the model's parameters to fit the training data. That method is all we need:

lr = LogisticRegressionCV()
lr.fit(train_X, train_y)

LogisticRegressionCV(Cs=10, class_weight=None, cv=None, dual=False,
           intercept_scaling=1.0, max_iter=100, multi_class='ovr',
           n_jobs=1, penalty='l2', random_state=None, refit=True,
           scoring=None, solver='lbfgs', tol=0.0001, verbose=0)

Obtain the classification accuracy of the classifier

Now we can use the test set to measure the classification accuracy of the trained classifier:

Print ("Accuracy = {:.2f}". Format (lr.score(test_X, test_y))) Accuracy = 0.83Copy the code

Do the same thing with Keras

Keras is a high-level neural network library created by Google's Francois Chollet. Its first commit to GitHub was on March 27 of last year, which makes it just over a year old at the time of writing.

As we just saw, building a classifier with Scikit-Learn is really easy:

  • Instantiate the classifier with one line of code
  • Train it with one line of code
  • Measure its accuracy with one line of code

Creating a classifier in Keras is only slightly more complicated. The data needs a small adjustment, and we have to do some work to define the network before we can instantiate it as a classifier. In every other way it is very similar to working with Scikit-Learn.

The first step is to adjust the data: the Scikit-Learn classifier accepts string labels such as "setosa", but Keras requires the labels to be one-hot encoded. That means we need to convert data that looks like:

setosa
versicolor
setosa
virginica
...

into the following form:

setosa versicolor virginica
     1          0         0
     0          1         0
     1          0         0
     0          0         1

There are many ways to do this. If you are a Pandas user you could use pandas.get_dummies(), and one-hot encoding is also built into Scikit-Learn. Here we will just use a Keras utility and a bit of NumPy.
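For instance, if you have Pandas available, the following sketch (not used in the rest of this tutorial) produces the same layout of 0/1 indicator columns:

import pandas as pd

# A hypothetical alternative to the Keras/NumPy helper below:
# get_dummies() builds one indicator column per unique label
pd.get_dummies(train_y).head()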

def one_hot_encode_object_array(arr):
    '''One hot encode a numpy array of objects (e.g. strings)'''
    uniques, ids = np.unique(arr, return_inverse=True)
    return np_utils.to_categorical(ids, len(uniques))

train_y_ohe = one_hot_encode_object_array(train_y)
test_y_ohe = one_hot_encode_object_array(test_y)
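As a quick sanity check, each row of the encoded array should be a one-hot vector, and since train_test_split() kept half of the 150 examples for training, we expect 75 rows of 3 columns:

print(train_y_ohe[:3])    # rows like [ 0.  1.  0.]
print(train_y_ohe.shape)  # (75, 3): 75 training examples, 3 classes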

Construct the neural network model

In this example, apart from the data transformation, the most significant and important difference from Scikit-Learn is that you must specify the structure of the model before you can instantiate and use it.

In Scikit-Learn, the models are ready-made. Keras, however, is a neural network library: beyond the constraints imposed by the number of features and classes in the data, every other aspect of the model is up to you, including the number of layers, the size of each layer, the nature of the connections between layers, and so on. (If you haven't worked with neural networks before, experimenting in Keras is a great way to build intuition for these choices.)

The price of this freedom is that instantiating even the smallest classifier involves a bit more work than Scikit-Learn's single line of code.

In our example, we will create a simple network. The data fixes two of the choices for us: we have four features and three target classes, so the input layer must have four units (neurons) and the output layer must have three. We only need to define the hidden layers; this project will have a single hidden layer, and we will give it 16 units. From a GPU's perspective, 16 is a nice round number! As you work with neural networks, you will see powers of two everywhere.

We will define our model in the most common way: as a sequential stack of layers. The alternative is to define it as a computational graph, but here we stick with Sequential().

model = Sequential()  # Sequential model reference: http://keras-cn.readthedocs.io/en/latest/

The next two lines define the size of the input layer (input_shape=(4,)), and the size and activation function of the hidden layer:

model.add(Dense(16, input_shape=(4,)))  # Dense layer reference: http://keras-cn.readthedocs.io/en/latest/layers/core_layer/
model.add(Activation('sigmoid'))  # activation function reference: http://keras-cn.readthedocs.io/en/latest/other/activations/

Next, define the size and activation function of the output layer:

model.add(Dense(3))
model.add(Activation('softmax'))

Finally, we specify the optimization strategy and the loss function, and ask the model to report its accuracy:

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=["accuracy"])
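As an aside, here is roughly what the same model looks like when defined as a computational graph with the functional API that Keras 1.x also provides. This is only a sketch for comparison; the input=/output= keyword names are the Keras 1.x spelling:

from keras.models import Model
from keras.layers import Input

# The same architecture: 4 inputs, a 16-unit sigmoid hidden layer, 3 softmax outputs
inputs = Input(shape=(4,))
hidden = Activation('sigmoid')(Dense(16)(inputs))
outputs = Activation('softmax')(Dense(3)(hidden))

graph_model = Model(input=inputs, output=outputs)
graph_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=["accuracy"])

We stick with the Sequential version for the rest of the tutorial.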

Use the neural network classifier

Now that we have defined the structure of the model and compiled it, we have an object whose API is essentially the same as a Scikit-Learn classifier's. In particular, it has .fit() and .predict() methods. Next, let's fit the model to the data.

Training a neural network often involves a concept called minibatching, which means showing the network a subset of the data, adjusting the weights, and then showing it another subset. One full pass of the network over the entire dataset is called an epoch. Tuning the minibatch/epoch strategy is a problem-specific issue. In this case we use a minibatch of size 1, which makes the training effectively stochastic gradient descent: the network looks at one flower at a time and adjusts its weights accordingly.

The verbose=0 in the expression below simply suppresses the progress output. If you want to adjust the minibatch/epoch strategy, re-run the cells above that define and compile the model first, so that its weights are reset each time.



model.fit(train_X, train_y_ohe, nb_epoch=100, batch_size=1, verbose=0);

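For example, to try a larger minibatch, a sketch might look like the following; the model is rebuilt from scratch so that it starts from freshly initialized weights, and the batch size of 16 is an arbitrary choice:

# Hypothetical rerun with a different minibatch size; model2 avoids clobbering the model above
model2 = Sequential()
model2.add(Dense(16, input_shape=(4,)))
model2.add(Activation('sigmoid'))
model2.add(Dense(3))
model2.add(Activation('softmax'))
model2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=["accuracy"])
model2.fit(train_X, train_y_ohe, nb_epoch=100, batch_size=16, verbose=0);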

In general, working with a compiled Keras model is just like working with a Scikit-Learn classifier; the main difference is that Scikit-Learn's .score() method corresponds to Keras's .evaluate().

.evaluate() returns the loss function and any other metrics we asked for when we compiled the model. In our case we asked for accuracy, which can be compared directly to the accuracy returned by the .score() method of the Scikit-Learn LogisticRegressionCV classifier.

loss, accuracy = model.evaluate(test_X, test_y_ohe, verbose=0)
print("Accuracy = {:.2f}".format(accuracy))

Accuracy = 0.99
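We can also exercise the .predict() side of the API mentioned earlier. Here is a sketch of turning predictions back into species names; the np.unique() call is an assumption that mirrors the sorted label order our one-hot encoder used:

# Predict class indices for the test set, then map them back to string labels
class_ids = model.predict_classes(test_X, verbose=0)  # Sequential models offer predict_classes()
species_names = np.unique(y)  # sorted unique labels, same order as one_hot_encode_object_array()
print(species_names[class_ids][:5])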

As you can see, the neural network model performs better on the test set than the simple logistic regression classifier.

That is reassuring, but not surprising. Even a very simple neural network has the flexibility to learn much more complex classification problems than logistic regression does, so of course it does better here.

But it also exposes one of the risks of neural networks: overfitting. We were careful to hold out a test set and measure performance on it, but that test set is small; 99% accuracy looks suspiciously high, and I wouldn't be surprised if there were some overfitting. You could address that by adding dropout (which is built into Keras), which plays a role analogous to the regularization used by the logistic regression classifier.
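A sketch of what that might look like; the dropout rate of 0.5 is an arbitrary assumption:

from keras.layers.core import Dropout

# Hypothetical variant of the model above with dropout regularization
model_dropout = Sequential()
model_dropout.add(Dense(16, input_shape=(4,)))
model_dropout.add(Activation('sigmoid'))
model_dropout.add(Dropout(0.5))  # randomly zero half of the hidden activations during training
model_dropout.add(Dense(3))
model_dropout.add(Activation('softmax'))
model_dropout.compile(optimizer='adam', loss='categorical_crossentropy', metrics=["accuracy"])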

And that's it. A neural network is overkill for this problem, and squeezing out the last bit of accuracy is not what I wanted to show you. The point of this article is that, with a batteries-included high-level library, we can build, train, and use a neural network model with only a little more code than a traditional classifier requires.

Next steps

We built a very simple feedforward neural network. To explore further, try loading the MNIST database of handwritten digits and see whether you can beat a standard Scikit-Learn classifier. Unlike with the Iris dataset, on that problem the power of neural networks, and the complexity that comes with them, is justified. Try it yourself, and if you get stuck, take a look at the notebook.
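As a starting point, here is a sketch of loading MNIST with the dataset helper that ships with Keras (this assumes your environment can download the data on first use):

from keras.datasets import mnist

# 60,000 training and 10,000 test images of 28x28 grayscale handwritten digits
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print(X_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)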

It’s easy to find tutorials for your favorite languages and libraries. If you’re interested in learning more about concepts and math background, try the following resources:

  • Michael Nielsen's online book Neural Networks and Deep Learning (especially chapters 1-3)
  • Andrew Ng's Coursera course Machine Learning (especially weeks 4-5)
  • Deep Learning by Yann LeCun
  • The wonderful articles by Chris Olah, especially those on backpropagation and recurrent neural networks

Speaking of recurrent neural networks: Keras also has layers that let you build models with:

  • Convolutional layers, which power state-of-the-art solutions to machine vision problems
  • Recurrent layers, which are particularly suited to modeling language and other sequence data

In fact, the power of neural networks comes from their composability. With a high-level library like Keras, it takes only a moment to build a different neural network (see the sketch below). Building models is a lot like playing with Legos. Have fun!
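For a taste of that composability, here is a sketch of swapping a recurrent layer into the same Sequential API; the shapes are arbitrary assumptions and have nothing to do with the Iris data:

from keras.layers.recurrent import LSTM

# A hypothetical sequence classifier: 10 timesteps of 4 features each, 3 output classes
seq_model = Sequential()
seq_model.add(LSTM(32, input_shape=(10, 4)))
seq_model.add(Dense(3))
seq_model.add(Activation('softmax'))
seq_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=["accuracy"])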

The original

  • "Hello world" in Keras
