Keras is a high-level neural network API written in Python that can run on top of TensorFlow, CNTK, or Theano as a backend. Keras development focuses on enabling rapid experimentation; being able to go from idea to result with the least possible delay is key to doing good research.
This article uses the Kaggle project IMDB Movie Review Sentiment Analysis as an example to show how to build a neural network with Keras and apply it to a practical problem. Reading it requires a basic understanding of neural networks.


The article is divided into two parts:

  • Basic concepts and API usage in Keras. I will give some simple usage examples or links to relevant material.
  • A hands-on IMDB movie review sentiment analysis that uses everything covered in Part 1.

Model

Dense fully connected layer

keras.layers.core.Dense(units, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)
# as first layer in a sequential model:

model = Sequential()
model.add(Dense(32, input_shape=(16,)))

# now the model will take as input arrays of shape (*, 16)
# and output arrays of shape (*, 32)
# after the first layer, you don’t need to specify
# the size of the input anymore:

model.add(Dense(32))

Embedding layer

keras.layers.embeddings.Embedding(input_dim, output_dim, embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None, embeddings_constraint=None, mask_zero=False, input_length=None)
Check out this link if you are interested: https://machinelearningmastery.c … eep-learning-keras/


Word to vector: this layer turns text into a representation in which every word is a vector.

  • input_dim: the size of the vocabulary, i.e. the total number of distinct words.
  • output_dim: the dimensionality of the vector each word is converted into.
  • input_length: the number of words in each sentence.
For example: we feed in an M×50 matrix built from 200 distinct words, and we want each word converted into a 32-dimensional vector. The layer returns a tensor of shape (M, 50, 32).


Each sentence has 50 words, each word is a 32-dimensional vector, and there are M sentences in total, so the output shape is (M, 50, 32).

e = Embedding(200, 32, input_length=50)
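If you want to check this shape yourself, here is a minimal sketch (the Sequential wrapper and the random integer input are assumptions for illustration, not part of the original example):

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding

# vocabulary of 200 words, 32-dimensional vectors, sentences of 50 words
model = Sequential()
model.add(Embedding(200, 32, input_length=50))

# M = 3 sentences, each a row of 50 word indices in the range [0, 200)
data = np.random.randint(0, 200, size=(3, 50))
print(model.predict(data).shape)  # (3, 50, 32)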

LSTM layer.

LSTM is a special case of recurrent neural network.
Deeplearning.net/tutorial/ls…


To put it simply, the networks we have discussed so far, including CNNs, are feed-forward and do not take sequence order into account, but the meaning of a word depends on its context. For example, in "I use a Xiaomi phone and eat millet porridge" (in Chinese, both "Xiaomi" and "millet" are written 小米), the two occurrences clearly do not mean the same thing, so semantic analysis has to consider context, which is exactly what a recurrent neural network (RNN) does. Or take "The movie was of high quality, but I didn't like it": the sentence contains both positive and negative opinions, and an LSTM can pick up on the "but", which is what we want to focus on.

keras.layers.recurrent.LSTM(units, activation='tanh', recurrent_activation='hard_sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0)
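As a rough sketch of how the layer behaves (the sizes below are illustrative assumptions, not taken from the article), return_sequences controls whether you get one output per time step or only the last one:

from keras.layers import Input, LSTM

inp = Input(shape=(50, 32))                    # 50 time steps, 32 features each
seq = LSTM(60, return_sequences=True)(inp)     # shape (?, 50, 60): an output for every word
last = LSTM(60, return_sequences=False)(inp)   # shape (?, 60): only the final output
print(seq.shape, last.shape)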

Pooling layer

  • keras.layers.pooling.GlobalMaxPooling1D() # global max pooling over the time dimension (see the sketch after this list)
    Stackoverflow.com/questions/4…

    • Input: 3D tensor with shape (samples, steps, features)
    • Output: 2D tensor with shape (samples, features)
  • keras.layers.pooling.MaxPooling1D(pool_size=2, strides=None, padding='valid')
  • keras.layers.pooling.MaxPooling2D(pool_size=(2, 2), strides=None, padding='valid', data_format=None)
  • keras.layers.pooling.MaxPooling3D(pool_size=(2, 2, 2), strides=None, padding='valid', data_format=None)
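Here is the sketch referenced above, a minimal check of the (samples, steps, features) → (samples, features) shape change; the concrete sizes are assumptions for illustration:

import numpy as np
from keras.models import Sequential
from keras.layers import GlobalMaxPooling1D

model = Sequential()
model.add(GlobalMaxPooling1D(input_shape=(400, 128)))  # input: (samples, 400 steps, 128 features)

data = np.random.rand(2, 400, 128)
print(model.predict(data).shape)  # (2, 128): the maximum over the 400 steps for each feature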

Data preprocessing

Text preprocessing

  • keras.preprocessing.text.text_to_word_sequence(text, filters=base_filter(), lower=True, split=" ")
  • keras.preprocessing.text.one_hot(text, n, filters=base_filter(), lower=True, split=" ")
  • keras.preprocessing.text.Tokenizer(num_words=None, filters=base_filter(), lower=True, split=" ")

    Tokenizer is a tool for vectorizing text, i.e. converting text into sequences of word indices in the dictionary (indices start from 1).

    • num_words: None or an integer, the maximum number of words to keep. If set to an integer, the tokenizer only keeps the num_words most frequent words in the dataset.
    • Whatever num_words is, the dictionary built by fit_on_texts is the same and every word gets an index; only the output of texts_to_sequences differs.
    • A sentence is represented by the indices of the (num_words - 1) most common words.
    • Note that X_t varies with num_words: only the top (num_words - 1) words in the dictionary are kept, so particularly rare words in a sentence are filtered out. For example, if the sentence is "x y z" and y and z are not among the top (num_words - 1) words, the final sequence for the sentence is just [x_index_in_dict].


t1 = "i love that girl"
t2 = 'i hate u'
texts = [t1, t2]
tokenizer = Tokenizer(num_words=None)
tokenizer.fit_on_texts(texts)

# Get an index for each word in the dictionary.
print(tokenizer.word_counts)
# OrderedDict([('i', 2), ('love', 1), ('that', 1), ('girl', 1), ('hate', 1), ('u', 1)])
print(tokenizer.word_index)
# {'i': 1, 'love': 2, 'that': 3, 'girl': 4, 'hate': 5, 'u': 6}
print(tokenizer.word_docs)
# {'i': 2, 'love': 1, 'that': 1, 'girl': 1, 'u': 1, 'hate': 1}
print(tokenizer.index_docs)
# {1: 2, 2: 1, 3: 1, 4: 1, 6: 1, 5: 1}

tokennized_texts = tokenizer.texts_to_sequences(texts)
print(tokennized_texts)
# [[1, 2, 3, 4], [1, 5, 6]]  each word is represented by its index

X_t = pad_sequences(tokennized_texts, maxlen=None)
# pad_sequences turns the list into a 2D array (matrix); every row has maxlen entries, with 0 for missing words
print(X_t)
# [[1 2 3 4]
#  [0 1 5 6]]
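To make the num_words behaviour described above concrete, here is a small follow-up sketch on the same texts (num_words=3 is an arbitrary choice for illustration):

from keras.preprocessing.text import Tokenizer

tokenizer3 = Tokenizer(num_words=3)
tokenizer3.fit_on_texts(texts)
print(tokenizer3.word_index)
# the full dictionary is still built: {'i': 1, 'love': 2, 'that': 3, 'girl': 4, 'hate': 5, 'u': 6}
print(tokenizer3.texts_to_sequences(texts))
# [[1, 2], [1]]  only the top num_words - 1 = 2 words ('i' and 'love') survive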

Sequence preprocessing

  • keras.preprocessing.sequence.pad_sequences(sequences, maxlen=None, dtype='int32', padding='pre', truncating='pre', value=0.) — returns a 2D tensor (a matrix)
  • keras.preprocessing.sequence.skipgrams(sequence, vocabulary_size, window_size=4, negative_samples=1., shuffle=True, categorical=False, sampling_table=None)
  • keras.preprocessing.sequence.make_sampling_table(size, sampling_factor=1e-5)
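As a small sketch of what skipgrams produces (the toy index sequence below is an assumption for illustration):

from keras.preprocessing.sequence import skipgrams

# a sentence already encoded as word indices; vocabulary_size must exceed the largest index
couples, labels = skipgrams([1, 2, 3], vocabulary_size=5, window_size=1, negative_samples=1.)
print(couples)  # word pairs, e.g. [[2, 3], [1, 2], [3, 2], [2, 1], ...] plus randomly sampled negatives
print(labels)   # 1 for real context pairs, 0 for negative samples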

Keras in practice: IMDB movie review sentiment analysis

Introduction to the dataset

  • labeledTrainData.tsv / imdb_master.csv: training data whose reviews are already labeled as positive or negative
  • testData.tsv: the test set, for which we need to predict whether each review is positive or negative
The main steps

  • Read the data
  • Clean the data: mainly remove stop words, HTML tags, and punctuation
  • Build the model
    • Embedding layer: converts words to vectors
    • LSTM layer
    • Pooling layer: extracts the most important features
    • Fully connected layer: classification


Data loading

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

df_train = pd.read_csv("./dataset/word2vec-nlp-tutorial/labeledTrainData.tsv", header=0, delimiter="\t", quoting=3)
df_train1 = pd.read_csv("./dataset/imdb-review-dataset/imdb_master.csv", encoding="latin-1")
df_train1 = df_train1.drop(["type", "file"], axis=1)
df_train1.rename(columns={'label': 'sentiment', 'Unnamed: 0': 'id', 'review': 'review'}, inplace=True)
df_train1 = df_train1[df_train1.sentiment != 'unsup']
df_train1['sentiment'] = df_train1['sentiment'].map({'pos': 1, 'neg': 0})
new_train = pd.concat([df_train, df_train1])

Data cleaning

Process the HTML with BeautifulSoup (bs4), then:

  • Keep only letters (filter out everything that is not a letter)
  • Remove stop words
import re
from bs4 import BeautifulSoup
from nltk.corpus import stopwords

def review_to_words(raw_review):
    review_text = BeautifulSoup(raw_review, 'lxml').get_text()
    letters_only = re.sub("[^a-zA-Z]", " ", review_text)  # replace non-letters with a space
    words = letters_only.lower().split()
    stops = set(stopwords.words("english"))
    meaningful_words = [w for w in words if not w in stops]
    return (" ".join(meaningful_words))

new_train['review'] = new_train['review'].apply(review_to_words)
df_test["review"] = df_test["review"].apply(review_to_words)

Building the network with Keras

Convert the text to a matrix


– Fit the Tokenizer on the list of sentences to build the dictionary, then replace each word with its index in the dictionary to obtain a matrix of numbers.

– pad_sequences pads with 0 so that every row of the matrix has the same length, i.e. every sentence has the same number of words.
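The snippets below assume roughly the following imports (a sketch; exact module paths can differ slightly between Keras versions):

from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Model
from keras.layers import Input, Embedding, LSTM, GlobalMaxPool1D, Dropout, Dense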

list_classes = ["sentiment"]
y = new_train[list_classes].values
print(y.shape)
list_sentences_train = new_train["review"]
list_sentences_test = df_test["review"]

max_features = 6000
tokenizer = Tokenizer(num_words=max_features)
tokenizer.fit_on_texts(list(list_sentences_train))
list_tokenized_train = tokenizer.texts_to_sequences(list_sentences_train)
list_tokenized_test = tokenizer.texts_to_sequences(list_sentences_test)
print(len(tokenizer.word_index))

totalNumWords = [len(one_comment) for one_comment in list_tokenized_train]
print(max(totalNumWords), sum(totalNumWords) / len(totalNumWords))

maxlen = 400
X_t = pad_sequences(list_tokenized_train, maxlen=maxlen)
X_te = pad_sequences(list_tokenized_test, maxlen=maxlen)

Model building

  • Convert words to vectors (the Input and Embedding layers)

inp = Input(shape=(maxlen, ))
print(inp.shape)  # (?, 400)  400 words per sentence

embed_size = 128  # each word is converted into a 128-dimensional vector
x = Embedding(max_features, embed_size)(inp)
print(x.shape)  # (?, 400, 128)
  • LSTM layer with 60 units
  • GlobalMaxPool1D: effectively keeps only the most important outputs along the sequence
  • Dropout: randomly discards part of the output, a form of regularization that helps prevent overfitting
  • Dense: fully connected layer
  • Model compilation: specifies the loss function, the optimizer, and the metrics used to evaluate the model

x = LSTM(60, return_sequences=True, name='lstm_layer')(x)
print(x.shape)
x = GlobalMaxPool1D()(x)
print(x.shape)
x = Dropout(0.1)(x)
print(x.shape)
x = Dense(50, activation="relu")(x)
print(x.shape)
x = Dropout(0.1)(x)
print(x.shape)
x = Dense(1, activation="sigmoid")(x)
print(x.shape)

model = Model(inputs=inp, outputs=x)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
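As a quick sanity check on the shapes printed above, model.summary() lists every layer's output shape and parameter count:

model.summary()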

  • Model training

batch_size = 32
epochs = 2
print(X_t.shape, y.shape)
model.fit(X_t, y, batch_size=batch_size, epochs=epochs, validation_split=0.2)

  • Predict with the model
prediction = model.predict(X_te)
y_pred = (prediction > 0.5)
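If you want to turn these predictions into a Kaggle submission file, a rough sketch could look like this (the 'id' column of df_test and the output path are assumptions, not shown in the original code):

submission = pd.DataFrame({
    "id": df_test["id"],                      # assumes df_test keeps an 'id' column
    "sentiment": y_pred.astype(int).ravel()   # boolean predictions converted to 0/1
})
submission.to_csv("submission.csv", index=False)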

Original address: www.cnblogs.com/sdu20112013…
