This article has been published on the public account GainAnEpoch. Please contact me before reprinting.

Basic workflow with TensorFlow

This article introduces the basic process of training a model with TensorFlow: creating and reading TFRecord files, training and saving the model, and restoring the model.

Preparation

  • Language: Python 3
  • Libraries: TensorFlow, cv2 (OpenCV), NumPy, Matplotlib
  • Dataset: the digit subset of the Chars74K dataset
  • Network: CNN

All code has been uploaded to GitHub: github.com/wmpscc/Tens…

TFRecord

TensorFlow provides a unified format for storing data, called TFRecord.

message Example {
    Features features = 1;
};

message Features {
    map<string, Feature> feature = 1;
};

message Feature {
    oneof kind {
        BytesList bytes_list = 1;
        FloatList float_list = 2;
        Int64List int64_list = 3;
    }
};

tf.train.Example contains a dictionary whose keys are strings and whose values are Feature messages. A Feature can hold a list of byte strings (BytesList), a list of floats (FloatList), or a list of integers (Int64List).

Writing a TFRecord is generally divided into three steps:

  • Read the data to be converted
  • Convert the data into an Example protocol buffer by filling in its features
  • Serialize the Example to a string and write it with TFRecordWriter

Method 1

Our data is stored in multiple folders, so the most direct approach is to traverse all the files in the directory, read each one, and write it to the TFRecord file. This method corresponds to the file maketfRecord.py. The key code is:

import os
import cv2
import tensorflow as tf

# HOME_PATH, transform_label, _int64_feature and _bytes_feature
# are defined elsewhere in maketfRecord.py.
filenameTrain = 'TFRecord/train.tfrecords'
filenameTest = 'TFRecord/test.tfrecords'
writerTrain = tf.python_io.TFRecordWriter(filenameTrain)
writerTest = tf.python_io.TFRecordWriter(filenameTest)
folders = os.listdir(HOME_PATH)
for subFoldersName in folders:
    # Every image in one folder shares the same label
    label = transform_label(subFoldersName)
    path = os.path.join(HOME_PATH, subFoldersName)
    subFoldersNameList = os.listdir(path)
    i = 0
    for imageName in subFoldersNameList:
        imagePath = os.path.join(path, imageName)
        images = cv2.imread(imagePath)
        res = cv2.resize(images, (128, 128), interpolation=cv2.INTER_CUBIC)
        # Raw pixel bytes (tostring() in older NumPy versions)
        image_raw_data = res.tobytes()
        example = tf.train.Example(features=tf.train.Features(feature={
            'label': _int64_feature(label),
            'image_raw': _bytes_feature(image_raw_data)
        }))
        # Roughly the first three quarters of each folder go to the training set
        if i <= len(subFoldersNameList) * 3 / 4:
            writerTrain.write(example.SerializeToString())
        else:
            writerTest.write(example.SerializeToString())
        i += 1
writerTrain.close()
writerTest.close()

When building the data, I use three quarters of each class as the training set and the remaining quarter as the test set, and save them in two separate files for convenience. The basic process is to traverse all the folders in the Fnt directory, go into each subfolder and traverse its image files, read each image with OpenCV's imread method, resize it, and convert the pixel data into a byte string. The _bytes_feature helper stores that string in the TFRecord data structure. At this point the image has been read and written into the Example. What about the label that goes with the image?

def transform_label(folderName):
    label_dict = {
        'Sample001': 0,
        'Sample002': 1,
        'Sample003': 2,
        'Sample004': 3,
        'Sample005': 4,
        'Sample006': 5,
        'Sample007': 6,
        'Sample008': 7,
        'Sample009': 8,
        'Sample010': 9,
        'Sample011': 10,
    }
    return label_dict[folderName]

I set up a dictionary. Since all the pictures in one folder belong to the same class, the dictionary maps the name of the folder that contains a picture to its label. The line label = transform_label(subFoldersName) obtains the label of the image in this way.
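The three-quarter split used when writing the records can be looked at in isolation. The sketch below is only an illustration in plain Python: split_three_quarters and split_like_article are hypothetical helper names and the file names are made up. It compares a slice-based split with the index test i <= n * 3 / 4 from the key code, which keeps one extra item in the training set because i starts at 0.

```python
def split_three_quarters(items):
    """Return (train, test) with the first 3/4 of items in train."""
    cut = len(items) * 3 // 4          # index of the first test item
    return items[:cut], items[cut:]


def split_like_article(items):
    """Reproduce the index test from the key code: train when i <= n * 3 / 4."""
    n = len(items)
    train = [x for i, x in enumerate(items) if i <= n * 3 / 4]
    test = [x for i, x in enumerate(items) if i > n * 3 / 4]
    return train, test


files = ["img%03d.png" % k for k in range(8)]    # hypothetical file names
train, test = split_three_quarters(files)
print(len(train), len(test))                      # 6 2
train2, test2 = split_like_article(files)
print(len(train2), len(test2))                    # 7 1
```

For folders with hundreds of images the one-item difference is negligible, so either form gives roughly the intended proportions.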

Method 2

When the data generated by Method 1 is used to train the model, overfitting occurs very easily. The tf.train.shuffle_batch method can shuffle the data in the queue before it is read, but when one class has many consecutive samples, the queue may hold only that class even after shuffling. For example, if the digit 0 has 1000 samples and the queue holds 1000 items, then every image read from the shuffled queue is still a 0. This easily leads to overfitting during training. To avoid it, my idea is to shuffle the images while writing the data. The corresponding file is maketFRecord2.py, and the key code is:

import os
import random
import cv2
import tensorflow as tf

# HOME_PATH, transform_label, the feature helpers and the two writers
# are set up as in Method 1.
totalList = []
folders = os.listdir(HOME_PATH)
for subFoldersName in folders:
    path = os.path.join(HOME_PATH, subFoldersName)
    subFoldersNameList = os.listdir(path)
    for imageName in subFoldersNameList:
        imagePath = os.path.join(path, imageName)
        totalList.append(imagePath)

# Generate a non-repeating random number sequence with the length of the total number of images
dictlist = random.sample(range(0, len(totalList)), len(totalList))
print(totalList[0].split('\\')[1].split('-')[0])
for i in range(len(totalList)):
    # Pick the image through the random index sequence
    images = cv2.imread(totalList[dictlist[i]])
    res = cv2.resize(images, (128, 128), interpolation=cv2.INTER_CUBIC)
    image_raw_data = res.tobytes()
    # The type name in the file name determines the label (Windows path separator)
    label = transform_label(totalList[dictlist[i]].split('\\')[1].split('-')[0])
    print(label)
    example = tf.train.Example(features=tf.train.Features(feature={
        'label': _int64_feature(label),
        'image_raw': _bytes_feature(image_raw_data)
    }))
    if i <= len(totalList) * 3 / 4:
        writerTrain.write(example.SerializeToString())
    else:
        writerTest.write(example.SerializeToString())
writerTrain.close()
writerTest.close()
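The problem that motivates Method 2 can be reproduced without TensorFlow. The sketch below is a simplified model of shuffling through a bounded buffer, not the actual queue implementation behind tf.train.shuffle_batch; bounded_shuffle is a hypothetical helper and the sizes are chosen only for illustration. When records are written class by class, the first several hundred items that come out all belong to the first class, however random the draw from the buffer is.

```python
import random


def bounded_shuffle(stream, buffer_size, seed=0):
    """Yield items from stream in random order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in stream:
        buffer.append(item)
        if len(buffer) > buffer_size:
            # Buffer is full: emit one random element and keep the rest.
            yield buffer.pop(rng.randrange(len(buffer)))
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


# Labels in the order Method 1 writes them: 1000 zeros, then 1000 ones.
labels = [0] * 1000 + [1] * 1000
out = list(bounded_shuffle(labels, buffer_size=100))

first_batch = out[:500]
print(sum(first_batch))   # 0, every one of the first 500 labels is class 0
```

Shuffling the file list before writing, as Method 2 does, removes the long single-class runs from the input, so even a small buffer then sees a mix of classes.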

The basic process of Method 2: walk through all the images in the directory and add each path to one large list. A non-repeating sequence of random indices then decides which image is written next, so the write order is random. How is the label obtained? Image files are named in the form type-serial number, so the type name is extracted from the file name and mapped to a label with a dictionary.

def transform_label(imgType):
    label_dict = {
        'img001': 0,
        'img002': 1,
        'img003': 2,
        'img004': 3,
        'img005': 4,
        'img006': 5,
        'img007': 6,
        'img008': 7,
        'img009': 8,
        'img010': 9,
        'img011': 10,
    }
    return label_dict[imgType]
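The path handling of Method 2 can also be sketched in plain Python. This is an illustration under assumptions, not the code from maketFRecord2.py: label_from_path is a hypothetical helper, the paths are invented, and the label is computed from the number in the type name instead of being looked up in the dictionary. It uses os.path.basename so that it does not depend on the Windows path separator.

```python
import os
import random


def label_from_path(image_path):
    """Map a path ending in 'img003-00012.png' to the integer label 2."""
    file_name = os.path.basename(image_path)      # 'img003-00012.png'
    type_name = file_name.split('-')[0]           # 'img003'
    return int(type_name[len('img'):]) - 1        # 3 - 1 = 2


# Hypothetical paths; real ones would come from os.listdir as in the key code.
paths = [os.path.join('Fnt', 'Sample%03d' % c, 'img%03d-%05d.png' % (c, k))
         for c in range(1, 11) for k in range(1, 4)]

# A random permutation of the indices, as produced by random.sample
order = random.sample(range(len(paths)), len(paths))
shuffled = [(paths[j], label_from_path(paths[j])) for j in order]
print(shuffled[0])
```

Computing the label from the number gives the same mapping as the dictionary for names of the form imgNNN, while the dictionary makes the supported classes explicit.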

CNN on full-size images

The training script is cnn_train.py. It reads the TFRecord data during training as follows:

def read_train_data():
    reader = tf.TFRecordReader()
    filename_train = tf.train.string_input_producer(["TFRecord128/train.tfrecords"])
    _, serialized_example = reader.read(filename_train)
    # Key names and types must match those used when writing the TFRecord
    features = tf.parse_single_example(
        serialized_example,
        features={
            'label': tf.FixedLenFeature([], tf.int64),
            'image_raw': tf.FixedLenFeature([], tf.string),
        }
    )
    img_train = features['image_raw']
    # Raw bytes back to a uint8 tensor, then to height x width x channels
    images_train = tf.decode_raw(img_train, tf.uint8)
    images_train = tf.reshape(images_train, [128, 128, 3])
    labels_train = tf.cast(features['label'], tf.int64)
    labels_train = tf.one_hot(labels_train, 10)
    return images_train, labels_train
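Two details of this reading code are easy to check by hand. The sketch below is a plain-Python illustration, not TensorFlow code: one_hot is a hypothetical stand-in that follows my understanding of tf.one_hot, where an index outside the range 0 to depth - 1 produces a row of zeros. It shows how many bytes the reshape to 128 x 128 x 3 expects per record, and what happens to a label of 10 when the depth is 10.

```python
def one_hot(label, depth):
    """Stand-in for one-hot encoding: out-of-range labels give an all-zero row."""
    return [1.0 if k == label else 0.0 for k in range(depth)]


HEIGHT, WIDTH, CHANNELS = 128, 128, 3
expected_bytes = HEIGHT * WIDTH * CHANNELS   # one uint8 per channel value
print(expected_bytes)                        # 49152

print(one_hot(3, 10))
print(sum(one_hot(10, 10)))                  # 0.0, label 10 has no slot at depth 10
```

If the label dictionaries are used with all eleven entries, the depth passed to one_hot would need to be 11 for the last class to be represented.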

The stored data is read back through features, and the key names and data types must be the same as those used when writing. For the convolutional neural network, I referred to the code Senior Wang wrote for his own training. Copying it directly did not work, because the loss became NaN. My solution is to follow AlexNet and add an LRN layer after the convolution to perform local response normalization. An L2 regularization term is also added when the weights are created. The key code is as follows:

# biasses, conv and max_pool_2x2 are helper functions defined elsewhere in cnn_train.py.
def weights_with_loss(shape, stddev, wl):
    # Create the variable first so that the L2 term is computed on the trained weights
    var = tf.Variable(tf.truncated_normal(stddev=stddev, shape=shape))
    if wl is not None:
        weight_loss = tf.multiply(tf.nn.l2_loss(var), wl, name='weight_loss')
        tf.add_to_collection('losses', weight_loss)
    return var


def net(image, drop_pro):
    # Convolution block 1: conv, pool, then local response normalization
    W_conv1 = weights_with_loss([5, 5, 3, 32], 5e-2, wl=0.0)
    b_conv1 = biasses([32])
    conv1 = tf.nn.relu(conv(image, W_conv1) + b_conv1)
    pool1 = max_pool_2x2(conv1)
    norm1 = tf.nn.lrn(pool1, 4, bias=1, alpha=0.001 / 9.0, beta=0.75)

    # Convolution block 2: conv, normalization, then pool
    W_conv2 = weights_with_loss([5, 5, 32, 64], stddev=5e-2, wl=0.0)
    b_conv2 = biasses([64])
    conv2 = tf.nn.relu(conv(norm1, W_conv2) + b_conv2)
    norm2 = tf.nn.lrn(conv2, 4, bias=1, alpha=0.001 / 9.0, beta=0.75)
    pool2 = max_pool_2x2(norm2)

    # Convolution blocks 3 and 4 carry the L2 penalty (wl=0.004)
    W_conv3 = weights_with_loss([5, 5, 64, 128], stddev=0.04, wl=0.004)
    b_conv3 = biasses([128])
    conv3 = tf.nn.relu(conv(pool2, W_conv3) + b_conv3)
    pool3 = max_pool_2x2(conv3)

    W_conv4 = weights_with_loss([5, 5, 128, 256], stddev=1 / 128, wl=0.004)
    b_conv4 = biasses([256])
    conv4 = tf.nn.relu(conv(pool3, W_conv4) + b_conv4)
    pool4 = max_pool_2x2(conv4)

    # Flatten, then two fully connected layers with dropout in between
    image_raw = tf.reshape(pool4, shape=[-1, 8 * 8 * 256])

    fc_w1 = weights_with_loss(shape=[8 * 8 * 256, 1024], stddev=1 / 256, wl=0.0)
    fc_b1 = biasses(shape=[1024])
    fc_1 = tf.nn.relu(tf.matmul(image_raw, fc_w1) + fc_b1)

    drop_out = tf.nn.dropout(fc_1, drop_pro)

    fc_2 = weights_with_loss([1024, 10], stddev=0.01, wl=0.0)
    fc_b2 = biasses([10])
    return tf.matmul(drop_out, fc_2) + fc_b2
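The L2 term added by weights_with_loss is simple to compute by hand. The following plain-Python sketch shows the quantity involved, assuming the usual definition of tf.nn.l2_loss as half the sum of squares; l2_loss and weight_penalty here are illustrative stand-ins working on a flat list of toy values, not the TensorFlow functions.

```python
def l2_loss(values):
    """Half the sum of squares of the given values."""
    return sum(v * v for v in values) / 2.0


def weight_penalty(values, wl):
    """Penalty contributed by one weight tensor: wl times its L2 loss."""
    return wl * l2_loss(values)


weights = [0.5, -1.0, 2.0]               # toy values standing in for a flattened tensor
print(l2_loss(weights))                   # 2.625
print(weight_penalty(weights, 0.004))     # small positive penalty
print(weight_penalty(weights, 0.0))       # 0.0, a coefficient of zero adds nothing
```

With the coefficients in the key code, only the third and fourth convolution layers (wl=0.004) contribute a non-zero penalty. The layers created with wl=0.0 still add a term to the 'losses' collection, but its value is always zero.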

Training process on the original 128x128x3 images

[Figures: training process, 128 * 128 input]

CNN on half-size images

The only difference between cnn_train2.py and the approach above is that the image size changes from 128*128*3 to 64*64*3, so the explanation is not repeated here.

Training process on 64x64x3 images

[Figures: training process, 64 * 64 input]
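Changing the input size also changes the number of features that reach the first fully connected layer. The helper definitions are not shown in the article, so the sketch below assumes that conv keeps the spatial size (stride 1 with 'SAME' padding) and that max_pool_2x2 halves it (stride 2). Under those assumptions, which agree with the 8 * 8 * 256 used in the 128x128 network, the flattened size for both inputs can be worked out as follows. flattened_size is a hypothetical helper.

```python
def flattened_size(input_side, num_pools=4, channels=256):
    """Spatial side and flattened feature count after num_pools 2x2 poolings."""
    side = input_side
    for _ in range(num_pools):
        side //= 2            # each 2x2 pooling with stride 2 halves height and width
    return side, side * side * channels


print(flattened_size(128))    # (8, 16384)
print(flattened_size(64))     # (4, 4096)
```

So for 64x64 input, the reshape and the first fully connected weight matrix would use 4 * 4 * 256 instead of 8 * 8 * 256, if the rest of the network is kept unchanged.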

Save the model

In cnn_train.py, the code for saving the model is

def save_model(sess, step):
    MODEL_SAVE_PATH = "./model128/"
    MODEL_NAME = "model.ckpt"
    saver = tf.train.Saver()
    saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME), global_step=step)

save_model(sess, i)

Here i is the number of iterations. The global_step argument is optional; when it is passed, the step number is appended to the checkpoint file name.

Verify accuracy on the test set

The code in the corresponding file, accuracytest.py, is basically the same as the training code, so this section only covers how to restore the model. The key code is:

ckpt = tf.train.get_checkpoint_state(MODEL_PATH)
if ckpt and ckpt.model_checkpoint_path:
    # Load model
    saver.restore(sess, ckpt.model_checkpoint_path)

Note that tf.train.get_checkpoint_state reads the checkpoint index file in the folder and returns the most recently saved model, which here is the one with the most iterations. The call saver.restore(sess, ckpt.model_checkpoint_path) then restores the variable values saved at that last iteration of training.
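The relationship between global_step and the checkpoint that gets restored can be illustrated with the file names alone. This plain-Python sketch does not call TensorFlow: checkpoint_prefix and step_of are hypothetical helpers, the step values are invented, and the naming pattern (model name, a hyphen, then the step) is the one Saver.save uses when global_step is given.

```python
import os


def checkpoint_prefix(save_dir, model_name, step):
    """Build the prefix 'save_dir/model_name-step' used for a saved checkpoint."""
    return os.path.join(save_dir, model_name) + "-" + str(step)


def step_of(prefix):
    """Parse the step number back out of a prefix ending in '-<step>'."""
    return int(prefix.rsplit("-", 1)[1])


# Hypothetical checkpoints saved at three different iterations
prefixes = [checkpoint_prefix("./model128/", "model.ckpt", s)
            for s in (100, 1000, 200)]
latest = max(prefixes, key=step_of)
print(latest)
```

In real use there is no need to parse names like this, since model_checkpoint_path already points at the latest save. The sketch only shows why passing the iteration count as global_step keeps the checkpoints from different iterations apart.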

View the images read from the TFRecord

If you want to check whether the image is processed correctly when making TFRecord, the easiest way is to display the image. The key code is as follows

def plot_images(images, labels):
    # Show the first 20 images in a 5 x 5 grid, with the label as title
    for i in np.arange(0, 20):
        plt.subplot(5, 5, i + 1)
        plt.axis('off')
        plt.title(labels[i], fontsize=14)
        plt.subplots_adjust(top=1.5)
        plt.imshow(images[i])
    plt.show()


plot_images(image, label)


[Figure: sample images]

Conclusion

I ran into many problems along the way. Thanks to Mr. Wang for his patient help. I hope this article can help more people. I am still a beginner, so if you find any mistakes, corrections are welcome. Thank you.

The code has been uploaded to GitHub: github.com/wmpscc/Tens…
