This post is an entry in the Mid-Autumn Festival creative submission contest.

U-Net

The network structure of U-Net is shown in the figure below; it is a standard encoder-decoder network. The whole network looks like the letter U, hence the name U-Net. The left half of the network is the contracting path (encoder) and the right half is the expanding path (decoder). The encoder follows the typical architecture of a convolutional network: repeated application of two 3×3 convolutions (unpadded), each followed by a rectified linear unit (ReLU), and a 2×2 max pooling operation with stride 2 for downsampling. Each step of the decoder upsamples the feature map: a 2×2 up-convolution halves the number of feature channels, the correspondingly cropped feature map from the contracting path is concatenated, and two 3×3 convolutions follow, each with a ReLU.
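For concreteness, the sketch below shows one contracting step and one expanding step in Keras. The 572×572 input size matches the original paper's example, but the framework choice, the channel counts shown, and the truncation to a single stage are illustrative assumptions for this sketch, not part of sketchKeras.

    from tensorflow.keras import layers, Model

    inputs = layers.Input(shape=(572, 572, 1))

    # Contracting step: two unpadded 3x3 convolutions with ReLU, then 2x2 max pooling.
    c1 = layers.Conv2D(64, 3, activation='relu')(inputs)
    c1 = layers.Conv2D(64, 3, activation='relu')(c1)
    p1 = layers.MaxPooling2D(pool_size=2, strides=2)(c1)
    c2 = layers.Conv2D(128, 3, activation='relu')(p1)
    c2 = layers.Conv2D(128, 3, activation='relu')(c2)

    # Expanding step: a 2x2 up-convolution halves the channels, then the encoder
    # feature map is cropped to the matching size and concatenated.
    u1 = layers.Conv2DTranspose(64, 2, strides=2)(c2)
    cropped = layers.Cropping2D(4)(c1)      # 568x568 -> 560x560, matching u1
    m1 = layers.concatenate([u1, cropped])
    c3 = layers.Conv2D(64, 3, activation='relu')(m1)
    c3 = layers.Conv2D(64, 3, activation='relu')(c3)

    model = Model(inputs, c3)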

sketchKeras

sketchKeras is an open-source U-Net-based line-art extraction project (project link). Its drawback is that the author does not disclose the model structure or the training algorithm, but the released weights file mod.h5 can be analyzed to infer the rough structure of the network.
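For example, assuming mod.h5 stores a full serialized Keras model rather than bare weights (an assumption, though load_model is how the project itself loads the file), the layer list can be dumped directly:

    # Load the released model file and print its layers to infer the structure.
    # Assumes mod.h5 contains a full serialized model, not just weights.
    from tensorflow.keras.models import load_model

    mod = load_model('mod.h5')
    mod.summary()  # layer types, output shapes, parameter counts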

An analysis of its inference source code shows that the pipeline mainly consists of the following steps:

Input image preprocessing

    import cv2
    import numpy as np

    # Scale the longer side to 512 pixels while keeping the aspect ratio.
    width = float(from_mat.shape[1])
    height = float(from_mat.shape[0])
    if width > height:
        new_width, new_height = 512, int(512 / width * height)
    else:
        new_width, new_height = int(512 / height * width), 512
    from_mat = cv2.resize(from_mat, (new_width, new_height), interpolation=cv2.INTER_AREA)
    cv2.imshow('raw', from_mat)
    cv2.imwrite('raw.jpg', from_mat)
    # HWC -> CHW, then compute a per-channel "light map" feature.
    from_mat = from_mat.transpose((2, 0, 1))
    light_map = np.zeros(from_mat.shape, dtype=np.float32)  # np.float is deprecated; use float32
    for channel in range(3):
        light_map[channel] = get_light_map_single(from_mat[channel])
    light_map = normalize_pic(light_map)
    light_map = resize_img_512_3d(light_map)
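The helpers get_light_map_single, normalize_pic, and resize_img_512_3d come from the project's helper module and are not shown above. A plausible reconstruction of get_light_map_single (hypothetical, not the verbatim source) is a per-channel high-pass filter: subtract a local mean so that flat regions go to roughly zero while edges keep large magnitudes.

    import cv2
    import numpy as np

    def get_light_map_single(channel: np.ndarray) -> np.ndarray:
        # channel: one 2-D uint8 image plane (after the HWC -> CHW transpose)
        blur = cv2.blur(channel, ksize=(5, 5))                 # local mean
        high_pass = channel.astype(np.int32) - blur.astype(np.int32)
        return high_pass.astype(np.float32) / 128.0            # scale to roughly [-2, 2]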

Model inference

    # Run the network on the preprocessed light map.
    line_mat = mod.predict(light_map, batch_size=1)

Cropping

    # Rearrange the prediction axes, drop the batch dimension, and crop
    # back to the resized image's extent.
    line_mat = line_mat.transpose((3, 1, 2, 0))[0]
    line_mat = line_mat[0:int(new_height), 0:int(new_width), :]
    show_active_img_and_save('sketchKeras_colored', line_mat, 'sketchKeras_colored.jpg')
    # Collapse the channels into a single map by taking the per-pixel maximum.
    line_mat = np.amax(line_mat, 2)

Denoising and image output

    # Three denoising variants, each displayed and written to disk.
    show_active_img_and_save_denoise_filter2('sketchKeras_enhanced', line_mat, 'sketchKeras_enhanced.jpg')
    show_active_img_and_save_denoise_filter('sketchKeras_pured', line_mat, 'sketchKeras_pured.jpg')
    show_active_img_and_save_denoise('sketchKeras', line_mat, 'sketchKeras.jpg')
    cv2.waitKey(0)
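The show_active_img_and_save_denoise* helpers are also project code that is not reproduced here. A minimal hypothetical version of the underlying idea, suppressing weak activations before converting to an 8-bit image, might look like this:

    import cv2
    import numpy as np

    def save_denoised(name: str, mat: np.ndarray, path: str, thresh: float = 0.1) -> None:
        mat = np.clip(mat, 0.0, 1.0)                   # activations to [0, 1]
        mat[mat < thresh] = 0.0                        # suppress weak, noisy responses
        img = (255.0 - mat * 255.0).astype(np.uint8)   # dark lines on a white background
        cv2.imshow(name, img)
        cv2.imwrite(path, img)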

Find a picture of a mooncake on the Internet to use as the test image.

Experimental results

Comparison with a non-neural-network method

The traditional line-draft extraction pipeline works as follows: convert the image to grayscale, invert it, apply a Gaussian blur, and finally blend with a color-dodge operation. The results are shown below. Clearly, there is a large gap between the contours extracted by this method and those produced by the neural-network approach.
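For reference, a minimal OpenCV sketch of that traditional pipeline (the file names and the blur kernel size are placeholder choices):

    import cv2

    img = cv2.imread('mooncake.jpg')                   # hypothetical input path
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    inverted = 255 - gray
    blurred = cv2.GaussianBlur(inverted, (21, 21), 0)
    # Color-dodge blend: gray / (255 - blurred), scaled back to the 8-bit range.
    sketch = cv2.divide(gray, 255 - blurred, scale=256)
    cv2.imwrite('sketch_traditional.jpg', sketch)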