A web version of "You Draw, I Guess" built with TensorFlow.js

Most of us have probably played the sketch-guessing game that was popular on social feeds a while ago, in which you draw a rough doodle and an AI guesses what it is. This article trains a model on Colab and builds a browser version of "You Draw, I Guess" with TensorFlow.js. What follows is a translation of the original article.

Code and demo

Code: https://github.com/zaidalyafeai/zaidalyafeai.github.io/tree/master/sketcher
Demo: https://zaidalyafeai.github.io/sketcher/
Google Colab notebook: https://colab.research.google.com/github/zaidalyafeai/zaidalyafeai.github.io/blob/master/sketcher/Sketcher.ipynb

The data set

We use a convolutional neural network (CNN) to recognize the category of a drawn image. The model is trained on the Quick Draw dataset, which contains about 50 million hand-drawn images across 345 categories.

Process

We'll use the Keras framework to train the model on a free Google Colab GPU, and then run the model in the browser with TensorFlow.js. I wrote a TensorFlow.js tutorial earlier, which you can read before continuing. The following figure shows the pipeline of this project.

Train on Colab

Google offers free GPUs on Colab. This tutorial shows how to create a notebook and enable GPU acceleration.

Imports

We use Keras with TensorFlow as the backend.

import os
import glob
import numpy as np
from tensorflow.keras import layers
from tensorflow import keras 
import tensorflow as tf

Load the data

Because memory is limited, we do not train on all the categories; we select only 100 of them. The data for each category is available on Google Cloud as a NumPy array of shape [N, 784], where N is the number of images in that category. Let's download the dataset first.

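import urllib.request

def download():
  # `classes` is assumed to be defined beforehand: a list of category names
  # with spaces replaced by underscores
  base = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'
  os.makedirs('data', exist_ok=True)  # make sure the target directory exists
  for c in classes:
    # turn the underscores into URL-encoded spaces
    cls_url = c.replace('_', '%20')
    path = base + cls_url + '.npy'
    print(path)
    urllib.request.urlretrieve(path, 'data/' + c + '.npy')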
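The download code swaps underscores for `%20`, which suggests that the file names in the bucket contain spaces. As a minimal sketch of that mapping, the hypothetical helper `class_to_url` below (not part of the original notebook) builds the same kind of URL with `urllib.parse.quote`. The base URL is the one used above; the class name is only an illustrative input.

```python
from urllib.parse import quote

BASE = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'

def class_to_url(class_name):
    """Map an underscore-separated class name to its .npy download URL."""
    # Hypothetical helper: restore the spaces, then percent-encode them.
    file_name = class_name.replace('_', ' ') + '.npy'
    return BASE + quote(file_name)

print(class_to_url('alarm_clock'))
```

Using `quote` instead of a manual `replace` also handles any other character that needs escaping, at the cost of one extra import.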

Because memory is limited, we load only 5000 images per category and reserve 20% of them as test data.

def load_data(root, vfold_ratio=0.2, max_items_per_class=5000):
    all_files = glob.glob(os.path.join(root, '*.npy'))

    # initialize variables
    x = np.empty([0, 784])
    y = np.empty([0])
    class_names = []

    # load a subset of the data to memory
    for idx, file in enumerate(all_files):
        data = np.load(file)
        data = data[0: max_items_per_class, :]
        labels = np.full(data.shape[0], idx)

        x = np.concatenate((x, data), axis=0)
        y = np.append(y, labels)

        class_name, ext = os.path.splitext(os.path.basename(file))
        class_names.append(class_name)

    data = None
    labels = None

    # separate into training and testing
    permutation = np.random.permutation(y.shape[0])
    x = x[permutation, :]
    y = y[permutation]

    vfold_size = int(x.shape[0]/100*(vfold_ratio*100))

    x_test = x[0:vfold_size, :]
    y_test = y[0:vfold_size]

    x_train = x[vfold_size:x.shape[0], :]
    y_train = y[vfold_size:y.shape[0]]
    return x_train, y_train, x_test, y_test, class_names
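The key detail in the split is that images and labels are shuffled with one shared permutation, so every image keeps its label. The following is a simplified, dependency-free sketch of that idea on toy data. `shuffle_and_split` is a hypothetical helper, and it uses Python lists and `random` rather than the NumPy arrays of the original code.

```python
import random

def shuffle_and_split(x, y, vfold_ratio=0.2, seed=0):
    """Shuffle x and y with one shared permutation, then split off a test fold."""
    assert len(x) == len(y)
    rng = random.Random(seed)
    permutation = list(range(len(x)))
    rng.shuffle(permutation)
    # Apply the same index order to both lists so pairs stay aligned.
    x = [x[i] for i in permutation]
    y = [y[i] for i in permutation]
    vfold_size = int(len(x) * vfold_ratio)
    # The first vfold_size items become the test fold, the rest the training fold.
    return x[vfold_size:], y[vfold_size:], x[:vfold_size], y[:vfold_size]

# Toy data: ten "images" whose label equals their value, so alignment is easy to check.
x = list(range(10))
y = list(range(10))
x_train, y_train, x_test, y_test = shuffle_and_split(x, y)
print(len(x_train), len(x_test))  # 8 2
print(x_train == y_train)         # True
```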

Data preprocessing

Before training the model, the data needs to be preprocessed. The model takes batches of shape [N, 28, 28, 1] and outputs probabilities of shape [N, 100].

# Assumed values, consistent with the shapes stated above
image_size = 28
num_classes = len(class_names)  # 100 in this article

# Reshape and normalize
x_train = x_train.reshape(x_train.shape[0], image_size, image_size, 1).astype('float32')
x_test = x_test.reshape(x_test.shape[0], image_size, image_size, 1).astype('float32')

x_train /= 255.0
x_test /= 255.0

# Convert class vectors to class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
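To make the preprocessing steps concrete, here is a small dependency-free sketch of what they do to a single image and a single label. `reshape_image`, `normalize` and `to_one_hot` are hypothetical helpers written for illustration only; the article itself relies on NumPy reshaping and `keras.utils.to_categorical`.

```python
def reshape_image(flat, image_size=28):
    """Turn a flat list of image_size * image_size pixels into rows."""
    assert len(flat) == image_size * image_size
    return [flat[r * image_size:(r + 1) * image_size] for r in range(image_size)]

def normalize(pixels):
    """Scale 0..255 pixel values to the range 0..1."""
    return [p / 255.0 for p in pixels]

def to_one_hot(label, num_classes):
    """Encode a class index as a vector with a single 1.0 entry."""
    vec = [0.0] * num_classes
    vec[int(label)] = 1.0  # labels may arrive as floats, hence int()
    return vec

image = reshape_image(normalize([0] * 783 + [255]))
print(len(image), len(image[0]))  # 28 28
print(image[27][27])              # 1.0
print(to_one_hot(2.0, 4))         # [0.0, 0.0, 1.0, 0.0]
```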

Create the model

We will create a simple convolutional neural network. The simpler the model and the fewer its parameters, the better, because we will run it in the browser and want predictions to come back quickly. The model therefore contains only three convolutional layers and two fully connected layers:

# Define model
model = keras.Sequential()
model.add(layers.Convolution2D(16, (3, 3),
                               padding='same',
                               input_shape=x_train.shape[1:], activation='relu'))
model.add(layers.MaxPooling2D(pool_size=(2, 2)))
model.add(layers.Convolution2D(32, (3, 3), padding='same', activation='relu'))
model.add(layers.MaxPooling2D(pool_size=(2, 2)))
model.add(layers.Convolution2D(64, (3, 3), padding='same', activation='relu'))
model.add(layers.MaxPooling2D(pool_size=(2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(100, activation='softmax'))

# Compile the model with the Adam optimizer
# (tf.train.AdamOptimizer is the TensorFlow 1.x API; TensorFlow 2.x uses tf.keras.optimizers.Adam)
adam = tf.train.AdamOptimizer()
model.compile(loss='categorical_crossentropy',
              optimizer=adam,
              metrics=['top_k_categorical_accuracy'])
print(model.summary())
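To get a feel for how small this network is, the parameter count can be worked out by hand. The sketch below is plain arithmetic, not part of the original notebook. It assumes 3 x 3 kernels with 'same' padding, 2 x 2 max pooling that rounds odd sizes down, and a bias for every filter and unit, which matches the layers listed above.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel convolution."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit for a fully connected layer."""
    return (inputs + 1) * units

size, channels, total = 28, 1, 0
for filters in (16, 32, 64):
    total += conv2d_params(3, channels, filters)
    channels = filters
    size //= 2  # 'same' convolution keeps the size; pooling halves it: 28 -> 14 -> 7 -> 3

flat = size * size * channels  # 3 * 3 * 64 = 576 values after Flatten
total += dense_params(flat, 128)
total += dense_params(128, 100)
print(total)  # 110052
```

Roughly 110 thousand parameters, most of them in the first fully connected layer, is a small model by CNN standards, which fits the goal of fast prediction in the browser.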

Fitting, validation and testing

We then train the model for 5 epochs with a batch size of 256, holding out 10% of the training data as a validation set.

# fit the model
model.fit(x=x_train, y=y_train, validation_split=0.1, batch_size=256, verbose=2, epochs=5)

# evaluate on unseen data
score = model.evaluate(x_test, y_test, verbose=0)
print('Test accuracy: {:0.2f}%'.format(score[1] * 100))
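As a rough check on what these settings mean in practice, the sketch below works out how many images end up in each subset and how many batches one epoch contains. It is plain arithmetic under the assumption that each of the 100 categories really contributes the full 5000 images; the variable names are illustrative.

```python
import math

num_classes = 100
max_items_per_class = 5000
test_percent = 20        # share reserved as test data when loading
validation_percent = 10  # share of the training data held out during fitting
batch_size = 256

total = num_classes * max_items_per_class          # 500000 images loaded
test_size = total * test_percent // 100            # 100000 test images
train_size = total - test_size                     # 400000 training images
val_size = train_size * validation_percent // 100  # 40000 validation images
fit_size = train_size - val_size                   # 360000 images used for weight updates

steps_per_epoch = math.ceil(fit_size / batch_size)
print(test_size, val_size, fit_size)  # 100000 40000 360000
print(steps_per_epoch)                # 1407
```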

Here are the results

On the test set, the model reaches a top-5 accuracy of 92.20%.
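Top-5 accuracy counts a prediction as correct when the true class is among the five classes with the highest predicted probability. The sketch below shows the idea in plain Python on a toy three-class example. `top_k_accuracy` is a hypothetical helper, not the Keras implementation behind the `top_k_categorical_accuracy` metric.

```python
def top_k_accuracy(probabilities, true_labels, k=5):
    """Fraction of samples whose true label is among the k most probable classes."""
    hits = 0
    for probs, label in zip(probabilities, true_labels):
        # Class indices sorted from most to least probable.
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)

# Toy example with three classes and three samples.
probabilities = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
true_labels = [2, 0, 2]
print(top_k_accuracy(probabilities, true_labels, k=2))  # 1.0
```

With k=1 the first sample would count as a miss, because its most probable class is 1 rather than 2. This is why top-5 accuracy is always at least as high as ordinary accuracy.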

Prepare the model for the web

Once the model reaches a satisfactory accuracy, we save it:

model.save('keras.h5')

Install tensorflowjs

!pip install tensorflowjs

Convert the model

!mkdir model
!tensorflowjs_converter --input_format keras keras.h5 model/

The conversion produces several weight files and a JSON file that describes the model architecture. We zip the model directory so that it can be downloaded to a local machine:

!zip -r model.zip model

Download the model

from google.colab import files
files.download('model.zip')

Make predictions in the browser

In this section, we show how to load the model in the browser and make predictions. We assume a 300 x 300 canvas. I will not cover the canvas implementation here and will focus on the TensorFlow.js part.

Load the model

To use TensorFlow.js, we first need to load the corresponding script:

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@latest"> </script>

You need a server to host the weight files. You can either run one on your local machine (Apache, for example) or host the files on GitHub, as I did.

// tf.loadModel is the pre-1.0 API; TensorFlow.js 1.0 and later use tf.loadLayersModel
model = await tf.loadModel('model/model.json')

Preprocessing

Before predicting, the drawing needs some processing. First, we get the image data from the canvas:

//the minimum bounding box around the current drawing
const mbb = getMinBox()
//calculate the dpi of the current window
const dpi = window.devicePixelRatio
//extract the image data
const imgData = canvas.contextContainer.getImageData(mbb.min.x * dpi, mbb.min.y * dpi,
                                                     (mbb.max.x - mbb.min.x) * dpi, (mbb.max.y - mbb.min.y) * dpi);

getMinBox() is introduced later in the article. The dpi variable scales the cropped region of the canvas according to the pixel density of the screen. We then convert the canvas image data into a tensor, resize it, and normalize it:

function preprocess(imgData) {
  return tf.tidy(() => {
    //convert the image data to a single-channel tensor
    //(tf.fromPixels is the pre-1.0 name; TensorFlow.js 1.0 and later use tf.browser.fromPixels)
    let tensor = tf.fromPixels(imgData, 1)

    //resize to 28 x 28
    const resized = tf.image.resizeBilinear(tensor, [28, 28]).toFloat()

    //normalize to [0, 1] and invert the intensities
    const offset = tf.scalar(255.0);
    const normalized = tf.scalar(1.0).sub(resized.div(offset));

    //add a leading dimension to get a batch of shape [1, 28, 28, 1]
    const batched = normalized.expandDims(0)
    return batched
  })
}

We use model.predict for prediction; it returns probabilities of shape [N, 100].

const pred = model.predict(preprocess(imgData)).dataSync()

We can then sort these probabilities to find the top 5 predictions.

Improve accuracy

The model accepts a tensor of shape [N, 28, 28, 1]. The drawing canvas is 300 x 300, but the user may draw only a very small figure on it, so we crop the region that contains the drawing by finding its top-left and bottom-right points.

//record the current drawing coordinates 	  
function recordCoor(event)
{
  //get current mouse coordinate 
  var pointer = canvas.getPointer(event.e);
  var posX = pointer.x;
  var posY = pointer.y;
 
  //record the point if within the canvas and the mouse is pressed
  if(posX >=0 && posY >= 0 && mousePressed)  
  {	  
    coords.push(pointer) 
  } 
}
 
//get the best bounding box by finding the top left and bottom right corners
function getMinBox(){
 
   var coorX = coords.map(function(p) {return p.x});
   var coorY = coords.map(function(p) {return p.y});
   //find top left corner 
   var min_coords = {
    x : Math.min.apply(null, coorX),
    y : Math.min.apply(null, coorY)
   }
   //find right bottom corner 
   var max_coords = {
    x : Math.max.apply(null, coorX),
    y : Math.max.apply(null, coorY)
   }
   return {
    min : min_coords,
    max : max_coords
   }
}

Hand-drawn test

The figure below shows some first-attempt drawings and, for each one, the category predicted with the highest probability. I drew all of them with a mouse.

Original article: https://medium.com/tensorflow/train-on-google-colab-and-run-on-the-browser-a-case-study-8a45f9b1474e