This article introduces an application of TensorFlow Lite and OpenCV, and walks through the full pipeline of an SSD model in TensorFlow Lite, from training to on-device use. The app scenario: when a user publishes a picture, the app detects and locates watermarks on the device and offers to remove them.

Specific steps are as follows:

1. Train an SSD model with the TensorFlow Object Detection API;
2. Optimize and convert the model;
3. Pass the output locations through the NMS (non-maximum suppression) algorithm to get the optimal box;
4. Remove the watermark with OpenCV.

Libraries and tools used:

TensorFlow (v1.8+)

TensorFlow Lite

OpenCV

labelImg

Detecting and locating watermarks with SSD

Introduction to SSD

SSD, short for Single Shot MultiBox Detector, is an object detection algorithm proposed by Wei Liu et al. at ECCV 2016. It remains one of the main detection frameworks, with a clear speed advantage over Faster R-CNN and a clear mAP advantage over YOLO (although it has since been surpassed by YOLO9000, CVPR 2017). SSD has the following features:

1. It inherits from YOLO the idea of casting detection as regression, so detection and classification are completed in a single forward pass of one network

2. It proposes prior boxes (default boxes) similar to the anchors in Faster R-CNN

3. It adds multi-scale detection based on a pyramidal feature hierarchy, which is close in spirit to (roughly half of) the FPN idea
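To make the prior-box idea from points 2 and 3 concrete, here is a minimal stdlib sketch of how SSD-style default boxes can be generated per feature-map cell. The function names and the reduced aspect-ratio set are my own simplification of the scheme in the SSD paper, not the Object Detection API's implementation:

```python
import math

def default_box_sizes(num_layers=6, s_min=0.2, s_max=0.9):
    """Per-layer default-box scales, linearly spaced between s_min and
    s_max across the feature-map layers (as in the SSD paper)."""
    return [s_min + (s_max - s_min) * k / (num_layers - 1)
            for k in range(num_layers)]

def default_boxes_for_cell(scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """(width, height) of each default box at one feature-map cell,
    relative to the input image (values in [0, 1]); each aspect ratio
    keeps the same area scale**2."""
    boxes = []
    for ar in aspect_ratios:
        boxes.append((scale * math.sqrt(ar), scale / math.sqrt(ar)))
    return boxes
```

Each deeper feature map gets a larger scale, which is what lets SSD match both small and large objects without region proposals.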

The TensorFlow Object Detection API provides pre-trained weights for a number of object detection network architectures, all trained on the COCO dataset. The accuracy and inference time of each model are as follows:

We retrain directly from the models provided by TensorFlow, so we can focus on the project without building a network from scratch. This article chooses the MobileNet-based SSD-300 model.

1.1 Model training

1. Configure the environment

TensorFlow Object Detection API code library: github.com/tensorflow/…

1.2 Compile the Protobuf library, which is used to configure the model and training parameters. Download the precompiled protobuf release (github.com/protocolbuf…), unpack it, and add it to your environment variables:

1.3 Add the models and slim directories to the PYTHONPATH environment variable:

2. Data preparation

The TensorFlow Object Detection API needs well-labeled images for training. labelImg, an open-source image annotation tool, is recommended; download link: github.com/tzutalin/la…

LabelImg interface:

3. Training

Open the downloaded COCO pretrained model folder, place the model.ckpt files in the training folder, and modify the ssd_mobilenet_v1_pets.config file in two places:

1. num_classes: change it to your own number of classes;

2. Replace every PATH_TO_BE_CONFIGURED with the paths you set earlier.
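For reference, an excerpt of the relevant fields in ssd_mobilenet_v1_pets.config (surrounding lines elided; num_classes: 1 is just an example for a single watermark class):

```
model {
  ssd {
    num_classes: 1   # change to your own number of classes
    ...
  }
}
train_config {
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
  ...
}
```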

Call train.py to start training:

--pipeline_config_path is the training configuration file path; --train_dir is the training output path.

1.2 Optimization and conversion of the model

Finally, the trained pb model is optimized with the official optimize_for_inference script, and then converted to a TFLite model with toco (the paths need to be modified); see the issues on the official GitHub for reference:

1.3 Running SSD on TFLite

The ssd_mobilenet.tflite model used in this case has FLOAT32 input and output. Since SSD has no fully connected layers, it can adapt to images of various sizes; the input shape of our model is {1, 300, 300, 3}.
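Before an image is fed into the {1, 300, 300, 3} FLOAT32 tensor, its 8-bit pixel values must be normalized. A stdlib sketch of the per-pixel step, assuming the usual MobileNet preprocessing of mapping [0, 255] to [-1, 1] (verify the mean/std against your own export settings):

```python
def normalize_pixels(rgb_pixels, mean=127.5, std=127.5):
    """Map 8-bit RGB values to FLOAT32 range [-1, 1]; rgb_pixels is a
    list of [r, g, b] triples after resizing the image to 300x300."""
    return [[(c - mean) / std for c in px] for px in rgb_pixels]
```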

The code for image input is as follows:

The output structure is an array containing Locations and Classes, as follows:

Traverse the outputs, apply the sigmoid activation to obtain the score, and save the class and location indices whose score is greater than 0.8.
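The traversal above can be sketched in a few lines of stdlib Python (the function name is mine; the on-device code does the same over the raw class tensor):

```python
import math

def sigmoid(x):
    """Logistic sigmoid, mapping a raw logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def filter_detections(class_logits, threshold=0.8):
    """Apply sigmoid to raw class scores and keep (index, score) pairs
    whose score exceeds the threshold (0.8, as used in this article)."""
    kept = []
    for i, logit in enumerate(class_logits):
        score = sigmoid(logit)
        if score > threshold:
            kept.append((i, score))
    return kept
```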

Parsing outputLocations:

Starting from outputLocations->data.f(0, 0, 0), the coordinates of the box are obtained with the following code, and the recognized classes and scores are output:
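The original on-device snippet is not reproduced here; as a sketch of the conversion, assuming each location is normalized [ymin, xmin, ymax, xmax] (the common SSD output order, but verify it for your exported model):

```python
def to_pixel_box(location, img_w, img_h):
    """Convert one normalized location [ymin, xmin, ymax, xmax] into
    pixel coordinates (left, top, right, bottom) on the original image."""
    ymin, xmin, ymax, xmax = location
    return (int(xmin * img_w), int(ymin * img_h),
            int(xmax * img_w), int(ymax * img_h))
```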

1.4 Non-maximum Suppression (NMS)

After parsing, one object may end up with multiple boxes; how do we determine which one is the most accurate? We use non-maximum suppression to suppress the redundant boxes. The suppression process is an iterate-traverse-eliminate loop:

1. Sort all boxes by score and select the highest-scoring box;
2. Traverse the remaining boxes, and delete any box whose overlap (IoU) with the current highest-scoring box exceeds a threshold;
3. Continue selecting the highest-scoring box from the unprocessed boxes and repeat the process.
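The three steps above map directly onto a short stdlib implementation (boxes as (x1, y1, x2, y2) pixel tuples; the 0.5 IoU threshold is a common default, not a value fixed by the article):

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Iterate-traverse-eliminate NMS; returns the kept box indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    while order:
        best = order.pop(0)          # step 1: take the highest score
        kept.append(best)
        order = [i for i in order    # step 2: drop heavy overlaps
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return kept                      # step 3: loop until none remain
```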

After processing:

Removing watermarks with OpenCV

There are two ways to remove watermarks in OpenCV:

One is to use the inpaint function directly (lower-quality processing, can handle opaque watermarks); the other is pixel-based color inversion and neutralization (higher-quality processing, can only handle translucent watermarks; not verified here).

Inpaint function:

Algorithm theory: based on the Fast Marching Method (FMM) repair algorithm proposed by Telea in 2004, the pixels on the edge of the region to be repaired are processed first, then the repair is pushed inward layer by layer until all pixels are repaired.

Processing: use inpaint on the original image. Since the location produced by SSD is hard to make perfectly accurate, the mask's watermark region should be enlarged appropriately; because this method repairs from the edge inward, this ensures the mask fully covers the watermark.
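The enlarge-then-mask step can be sketched with stdlib Python (function names are mine; in a real app you would build the mask as a numpy uint8 array and pass it to cv2.inpaint with the cv2.INPAINT_TELEA flag):

```python
def expand_box(box, margin, img_w, img_h):
    """Enlarge a detected watermark box by `margin` pixels on every
    side, clamped to the image bounds, so the mask fully covers the
    watermark despite SSD localization error."""
    x1, y1, x2, y2 = box
    return (max(0, x1 - margin), max(0, y1 - margin),
            min(img_w, x2 + margin), min(img_h, y2 + margin))

def make_mask(box, img_w, img_h):
    """Binary mask as a list of rows: 255 inside the box, 0 elsewhere;
    same size as the original image, as inpaint expects."""
    x1, y1, x2, y2 = box
    return [[255 if (x1 <= x < x2 and y1 <= y < y2) else 0
             for x in range(img_w)]
            for y in range(img_h)]
```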

The following mask image is generated from the watermark position and watermark style (it must be the same size as the original image):

After processing:

Pixel-based color inversion and neutralization:

This method can remove semi-transparent watermarks at fixed positions. The principle is to use the watermark mask image to invert the blending on the watermarked image and compute the original color values at the watermark position.
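Assuming the watermark was applied by standard alpha blending (observed = alpha * watermark + (1 - alpha) * original), the inversion can be sketched per channel as follows; the function name and the blending model are my assumptions, since the article does not verify this method:

```python
def recover_pixel(observed, watermark, alpha):
    """Invert alpha blending to recover the original channel value:
    observed = alpha*watermark + (1-alpha)*original
    => original = (observed - alpha*watermark) / (1 - alpha).
    Channel values are in [0, 255]; alpha is the watermark opacity."""
    value = (observed - alpha * watermark) / (1.0 - alpha)
    return min(255, max(0, round(value)))
```

Running this over every pixel inside the watermark mask neutralizes the semi-transparent overlay; it fails for fully opaque watermarks (alpha = 1), which is why those fall back to inpainting.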

Conclusion

In practical applications, the watermark position is basically fixed: the four corners or the center of the image. Therefore, during SSD detection, we can consider adding rules, or trying an attention model that increases the weight given to the four corners and the center, to improve efficiency and accuracy. This method also has a limitation: at present, the specific style of the watermark must be known in advance, and images containing that watermark must be included in training, so a new, unseen watermark cannot be correctly recognized and removed. Later we will try repairing watermarked images directly with a GAN; if you have done related work, you are welcome to discuss.