Computational vision: Unsupervised learning and Image Segmentation

Supervised Learning and Unsupervised Learning

In the watermelon book (Machine Learning by Zhou Zhihua), there is an introduction to supervised and unsupervised learning. Here we use Wikipedia’s interpretation of both:

Supervised learning is the machine learning task of learning a function that maps inputs to outputs from example input-output pairs. It infers a function from labeled training data consisting of a set of training examples. In supervised learning, each example is a pair of an input object (usually a vector) and a desired output value (also known as the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function that can be used to map new examples. In human and animal psychology, this type of task is often referred to as concept learning.

Unsupervised learning is a type of machine learning that looks for previously undetected patterns in a data set with no pre-existing labels and minimal human supervision. Unlike supervised learning, which typically uses human-labeled data, unsupervised learning (also known as self-organization) allows modeling of the probability density of the inputs. Together with supervised and reinforcement learning, it forms one of the three main categories of machine learning. Two of the main methods of unsupervised learning are principal component analysis and cluster analysis; cluster analysis groups data that has not been labeled, classified, or categorized.

The key difference between supervised and unsupervised learning is whether there is a set of “standard answers”, that is, whether there are labels to supervise and correct the learning results. The most typical supervised and unsupervised learning problems are classification (supervised learning) and clustering (unsupervised learning) respectively.

For the classification task, suppose we have a batch of cat and dog images, each with a label, so every image is known to be a cat or a dog. We can use the images and their labels to train a classification model that generalizes to new cat and dog images. For clustering, we have no labels: we have a bunch of images of dogs and cats, but we do not know which is which. Without labels there is no “standard answer”, so we can only learn from the image data itself, explore its underlying regularities, and divide the images into several clusters. A single cluster may still contain both cats and dogs, but overall the clusters separate cat images from dog images according to how the image data is distributed.

Image Segmentation

In computer vision, we are interested in how to group pixels that belong together, which is called the image segmentation problem. Humans segment images intuitively, and the result depends on how the viewer splits the image while looking at it. For example, two people looking at the same optical illusion may see different things: in the picture below you may see zebras, or you may see lions.

One of the motivations behind image segmentation is to segment the image into coherent objects, as follows:

We may also want to split the image into many groups based on the similarity of nearby pixels. These groups are called “superpixels”. Superpixels allow us to treat many individual pixels as a single cluster for faster computation. Here is an example of an image split into superpixels.

Superpixel segmentation and other forms of segmentation are beneficial to image feature extraction. We can treat a group of pixels as a feature from which to obtain image information. In addition, image segmentation also facilitates some common photo effects, such as background removal. If we can segment an image correctly, we will be able to keep the group of pixels we want and delete other irrelevant groups of pixels.
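To make the background-removal idea concrete, here is a minimal sketch. It assumes we already have a per-pixel label array (called `segments` here) from some segmentation method; the helper name `keep_segment` and the toy data are made up for illustration and are not part of the code later in this post.

```python
import numpy as np

def keep_segment(img, segments, label, fill=0):
    """Return a copy of img in which every pixel outside the chosen segment is set to `fill`."""
    out = np.full_like(img, fill)
    mask = segments == label      # (H, W) boolean mask of the pixels we want to keep
    out[mask] = img[mask]         # copy only the selected pixels
    return out

# Toy 2x3 RGB image and a hypothetical segmentation with two groups (0 and 1).
img = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)
segments = np.array([[0, 0, 1],
                     [1, 1, 0]])

foreground = keep_segment(img, segments, label=1)
```

Pixels labeled 1 keep their original color, and everything else becomes black. The quality of the result depends entirely on how good the segmentation is.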

Although image segmentation is very useful and is needed in a variety of scenarios, there is no “optimal” image segmentation method; we must compare different algorithms to find the one that best suits our problem. If an image is split into too many groups it is over-segmented, and if it is split into too few it is under-segmented.

To solve the image segmentation problem, we can treat it as a clustering problem. Through clustering, we can group similar data points together and represent them with a single value, which is very helpful for further operations on the image or for extracting image features. However, this raises the following questions:

How do we determine whether two pixels, pixel patches, or images are similar?
How do we compute a clustering, locally or globally, based on the spatial information of the image?

Different clustering algorithms answer these questions differently. Generally speaking, clustering approaches can be divided into top-down and bottom-up. Top-down algorithms group together pixels that lie on the same visual entity. Bottom-up algorithms group locally related pixels together.

Clustering

Clustering is widely used in prediction, analysis, classification, and other areas. This article takes K-means as an example to introduce clustering algorithms. As shown in the figure below, the input image in the upper left corner contains three regions of different colors, so it can easily be segmented using its histogram. The input image in the lower left corner, however, has no uniformly colored regions, and its histogram does not separate cleanly. To segment such an image, we can use the K-means clustering algorithm.
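The easy case can be sketched in a few lines. The toy grayscale image below is an assumption made up for illustration: it consists of three perfectly flat regions, so its histogram has exactly three non-empty bins and each pixel can be labeled directly by its intensity.

```python
import numpy as np

# Hypothetical grayscale image made of three flat regions (values 0, 128, 255).
img = np.array([[0,   0,   128, 128],
                [0,   0,   128, 255],
                [255, 255, 255, 255]], dtype=np.uint8)

# Histogram of intensities: only three bins are non-empty, one per region.
counts = np.bincount(img.ravel(), minlength=256)
peaks = np.flatnonzero(counts)          # the intensities that actually occur

# Label each pixel by the position of its intensity among the peaks.
labels = np.searchsorted(peaks, img)
```

Once the image contains noise or gradual color changes, the histogram peaks smear into each other and this direct lookup no longer works, which is where a clustering algorithm such as K-means comes in.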

Using K-means, our goal here is to identify three cluster centers as representative intensities and label each pixel according to its closest center. The best cluster centers are those that minimize the sum of the squared distances between all points and their nearest cluster center.
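As a rough illustration of this objective, the sketch below computes the sum of squared distances from each point to its nearest center. The function name `kmeans_objective` and the sample points are assumptions chosen for this example, not part of the code later in the post.

```python
import numpy as np

def kmeans_objective(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    # diffs[i, j] is the vector from center j to point i; shape (N, k, D)
    diffs = points[:, None, :] - centers[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)            # (N, k) squared distances
    return float(np.sum(np.min(sq_dists, axis=1)))   # keep only the nearest center per point

points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0]])
good_centers = np.array([[0.0, 1.0], [10.0, 0.0]])   # sit on top of the two groups
bad_centers = np.array([[5.0, 0.0], [5.0, 2.0]])     # sit between the groups

good_cost = kmeans_objective(points, good_centers)
bad_cost = kmeans_objective(points, bad_centers)
```

Centers placed on the natural groups give a much lower cost than centers placed between them, and K-means tries to drive this number down.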

Algorithm

Finding cluster centers and cluster members can be thought of as a “chicken and egg” problem. If we know the cluster centers, we can find the cluster members by assigning each point to the nearest center. On the other hand, if we know the cluster memberships, we can find each cluster center by computing the mean of its members.
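The two halves of this chicken-and-egg problem can be written as two small functions. This is only a sketch: the function names and toy points are assumptions for illustration, and the second function assumes that no cluster ends up empty.

```python
import numpy as np

def assign_to_centers(points, centers):
    """Given the centers, label each point with the index of its nearest center."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)  # (N, k)
    return np.argmin(dists, axis=1)

def centers_from_members(points, assignments, k):
    """Given the memberships, each center is the mean of its members (assumes no empty cluster)."""
    return np.array([points[assignments == j].mean(axis=0) for j in range(k)])

points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centers = np.array([[1.0, 1.0], [9.0, 1.0]])          # a rough initial guess

assignments = assign_to_centers(points, centers)
new_centers = centers_from_members(points, assignments, k=2)
```

K-means simply alternates between these two steps, starting from an initial guess for the centers.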

To find the cluster centers and cluster members, we first initialize K cluster centers (K must be specified in advance), usually at random. Then we run an iterative process that alternately recomputes the cluster members and the cluster centers, looking for the optimal (i.e., smallest total distance) configuration, until the cluster centers converge.

The algorithm flow is as follows:

1. Initialize the cluster centers.
2. Assign each point in the data set to the nearest center, using Euclidean distance as the distance measure.
3. Update each cluster center to the mean of its cluster members.
4. Repeat steps 2-3 until the cluster centers stop changing or the maximum number of iterations is reached.

The algorithm flow chart is as follows:

This article gives a very good application example of K-means clustering. The author compiled the results of 15 Asian football teams from 2005 to 2010 and then used the K-means algorithm (k = 3) to cluster the teams into three groups. The results, shown below, match reality quite well.

Asia, group 1: Japan, South Korea, Iran, Saudi Arabia
Asia, group 2: Uzbekistan, Bahrain, North Korea
Asia, group 3: China, Iraq, Qatar, United Arab Emirates, Thailand, Vietnam, Oman, Indonesia

Next, we apply clustering to actual image segmentation.

Actual code

# segmentation.py
# python 3.6
import numpy as np
import random
from scipy.spatial.distance import squareform, pdist
from skimage.util import img_as_float

### Clustering Methods
def kmeans(features, k, num_iters=100):

    N, D = features.shape

    assert N >= k, 'Number of clusters cannot be greater than number of points'

    # Randomly initialize cluster centers
    idxs = np.random.choice(N, size=k, replace=False)
    centers = features[idxs]        # 1. Random center point
    assignments = np.zeros(N, dtype=int)    # Integer labels, so np.max(segments) + 1 can be used in range() later

    for n in range(num_iters):
        ### YOUR CODE HERE
        # 2. Classification
        for i in range(N):
            dist = np.linalg.norm(features[i] - centers, axis=1)    # Distance between each point and the center point
            assignments[i] = np.argmin(dist)        # The ith point belongs to the nearest center point

        pre_centers = centers.copy()
        # 3. Recalculate the center point
        for j in range(k):
            centers[j] = np.mean(features[assignments == j], axis=0)

        # 4. Verify whether the center point has changed
        if np.array_equal(pre_centers, centers):
            break
        ### END YOUR CODE

    return assignments

def kmeans_fast(features, k, num_iters=100):

    N, D = features.shape

    assert N >= k, 'Number of clusters cannot be greater than number of points'

    # Randomly initialize cluster centers
    idxs = np.random.choice(N, size=k, replace=False)
    centers = features[idxs]
    assignments = np.zeros(N)

    for n in range(num_iters):
        ### YOUR CODE HERE
        # Calculate distance
        features_tmp = np.tile(features, (k, 1))        # (k*N, ...)
        centers_tmp = np.repeat(centers, N, axis=0)     # (N * k, ...)
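        dist = np.sum((features_tmp - centers_tmp)**2, axis=1).reshape((k, N))      # Row c, column i: squared distance from point i to center c
        assignments = np.argmin(dist, axis=0)   # Index of the nearest center for each point

        # Calculate the new center point
        pre_centers = centers.copy()    # Copy, otherwise pre_centers aliases centers and the check below always passes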
        # 3. Recalculate the center point
        for j in range(k):
            centers[j] = np.mean(features[assignments == j], axis=0)

        # 4. Verify whether the center point has changed
        if np.array_equal(pre_centers, centers):
            break
        ### END YOUR CODE

    return assignments



def hierarchical_clustering(features, k):

    N, D = features.shape

    assert N >= k, 'Number of clusters cannot be greater than number of points'

    # Assign each point to its own cluster
    assignments = np.arange(N)
    centers = np.copy(features)
    n_clusters = N

    while n_clusters > k:
        ### YOUR CODE HERE
        dist = pdist(centers)       # Calculate the distance between each other
        matrixDist = squareform(dist)   # Change vector form to matrix form
        matrixDist = np.where(matrixDist != 0.0, matrixDist, 1e10)  # Change 0.0 to 1e10, i.e. exclude the zero distance of each point to itself

        minValue = np.argmin(matrixDist)        # The (flattened) position of the smallest value
        min_i = minValue // n_clusters          # row number
        min_j = minValue - min_i * n_clusters   # column number

        if min_j < min_i:       # make min_i the smaller index, so we merge into the lower-numbered cluster
            min_i, min_j = min_j, min_i  # Swap

        for i in range(N):
            if assignments[i] == min_j:
                assignments[i] = min_i     # Merge the two

        for i in range(N):
            if assignments[i] > min_j:
                assignments[i] -= 1     # A cluster is merged, so the n_clusters are reduced by one

        centers = np.delete(centers, min_j, axis=0)  # Reduce one
        centers[min_i] = np.mean(features[assignments == min_i], axis=0)        # recalculate the center point

        n_clusters -= 1     # minus 1

        ### END YOUR CODE

    return assignments


### Pixel-Level Features
def color_features(img):
    H, W, C = img.shape
    img = img_as_float(img)
    features = np.zeros((H*W, C))

    ### YOUR CODE HERE
    features = img.reshape(H * W, C)        # color as a feature
    ### END YOUR CODE

    return features

def color_position_features(img):
    H, W, C = img.shape
    color = img_as_float(img)
    features = np.zeros((H*W, C+2))

    ### YOUR CODE HERE
    # coordinates
    cord = np.dstack(np.mgrid[0:H, 0:W]).reshape((H*W, 2))      # mgrid generates coordinates, reformatted as (x,y) two-dimensional
    features[:, 0:C] = color.reshape((H*W, C))      # r,g,b
    features[:, C:C+2] = cord
    features = (features - np.mean(features, axis=0)) / np.std(features, axis=0,  ddof = 0)     # Normalized processing of features
    ### END YOUR CODE

    return features

def my_features(img):
    """ Implement your own features Args: img - array of shape (H, W, C) Returns: features - array of (H * W, C) """
    features = None
    ### YOUR CODE HERE
    features = color_position_features(img)
    ### END YOUR CODE
    return features


### Quantitative Evaluation
def compute_accuracy(mask_gt, mask):
    accuracy = None
    ### YOUR CODE HERE
    mask_end = mask_gt - mask
    count = len(mask_end[np.where(mask_end == 0)])
    accuracy = count / (mask_gt.shape[0] * mask_gt.shape[1])
    ### END YOUR CODE

    return accuracy

def evaluate_segmentation(mask_gt, segments):
    num_segments = np.max(segments) + 1
    best_accuracy = 0

    # Compare the segmentation result with the real value
    for i in range(num_segments):
        mask = (segments == i).astype(int)
        accuracy = compute_accuracy(mask_gt, mask)
        best_accuracy = max(accuracy, best_accuracy)

    return best_accuracy

# test.py
import  numpy as np
from scipy.spatial.distance import  pdist, squareform

if __name__ == "__main__":
    a = np.array([[1, 1, 1, 1], [1, 0, 0, 0]])
    b = np.array([[1, 0, 0, 1], [1, 1, 0, 0]])
    c = (a == 1)
    print(c)


# utils.py
import numpy as np
import matplotlib.pyplot as plt
from skimage.util import img_as_float
from skimage import transform
from skimage import io

from segmentation import *

import os

def visualize_mean_color_image(img, segments):

    img = img_as_float(img)
    k = np.max(segments) + 1
    mean_color_img = np.zeros(img.shape)

    for i in range(k):
        mean_color = np.mean(img[segments == i], axis=0)
        mean_color_img[segments == i] = mean_color

    plt.imshow(mean_color_img)
    plt.axis('off')
    plt.show()

def compute_segmentation(img, k,
        clustering_fn=kmeans_fast,
        feature_fn=color_position_features,
        scale=0):
    """First, a feature vector is extracted from each pixel of the image. Then the clustering algorithm is applied to the set of all feature vectors. Two pixels are assigned to the same segment if and only if their feature vectors are assigned to the same cluster."""

    assert scale <= 1 and scale >= 0, \
        'Scale should be in the range between 0 and 1'

    H, W, C = img.shape

    if scale > 0:
        # Zoom out the image for faster computation
        img = transform.rescale(img, scale)

    features = feature_fn(img)
    assignments = clustering_fn(features, k)
    segments = assignments.reshape((img.shape[:2]))

    if scale > 0:
        # Resize split back to the original size of the image
        segments = transform.resize(segments, (H, W), preserve_range=True)

        # Resizing interpolates, so the values may no longer be integer labels.
        # Round each pixel value to the nearest integer.
        segments = np.rint(segments).astype(int)

    return segments


def load_dataset(data_dir):
    """Load data set 'imgs/aaa.jpg' is 'gt/aaa.png'"""

    imgs = []
    gt_masks = []

    for fname in sorted(os.listdir(os.path.join(data_dir, 'imgs'))):
        if fname.endswith('.jpg'):
            # Read the image
            img = io.imread(os.path.join(data_dir, 'imgs', fname))
            imgs.append(img)

            # Load the corresponding segmentation mask
            mask_fname = fname[:-4] + '.png'
            gt_mask = io.imread(os.path.join(data_dir, 'gt', mask_fname))
            gt_mask = (gt_mask != 0).astype(int)     # binary mask
            gt_masks.append(gt_mask)

    return imgs, gt_masks


Running example

# Initialization
from __future__ import print_function   # must come before any other statement in the cell
from time import time
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rc
from skimage import io

%matplotlib inline
plt.rcParams['figure.figsize'] = (15.0, 12.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Automatically reload external modules
%load_ext autoreload
%autoreload 2

# Generate random data points for clustering

# Use seed to ensure consistent results
np.random.seed(0)

# Cluster 1
mean1 = [-1, 0]
cov1 = [[0.1, 0], [0, 0.1]]
X1 = np.random.multivariate_normal(mean1, cov1, 100)

# Cluster 2
mean2 = [0, 1]
cov2 = [[0.1, 0], [0, 0.1]]
X2 = np.random.multivariate_normal(mean2, cov2, 100)

# Cluster 3
mean3 = [1, 0]
cov3 = [[0.1, 0], [0, 0.1]]
X3 = np.random.multivariate_normal(mean3, cov3, 100)

# Cluster 4
mean4 = [0, -1]
cov4 = [[0.1, 0], [0, 0.1]]
X4 = np.random.multivariate_normal(mean4, cov4, 100)

# Merge the four sets of data points
X = np.concatenate((X1, X2, X3, X4))

# Draw data points
plt.scatter(X[:, 0], X[:, 1])
plt.axis('equal')
plt.show()

from segmentation import kmeans

np.random.seed(0)
start = time()
assignments = kmeans(X, 4)
end = time()

kmeans_runtime = end - start

print("kmeans running time: %f seconds." % kmeans_runtime)

for i in range(4):
    cluster_i = X[assignments==i]
    plt.scatter(cluster_i[:, 0], cluster_i[:, 1])

plt.axis('equal')
plt.show()

kmeans running time: 0.027956 seconds.

from segmentation import hierarchical_clustering

start = time()
assignments = hierarchical_clustering(X, 4)
end = time()

print("hierarchical_clustering running time: %f seconds." % (end - start))

for i in range(4):
    cluster_i = X[assignments==i]
    plt.scatter(cluster_i[:, 0], cluster_i[:, 1])

plt.axis('equal')
plt.show()

hierarchical_clustering running time: 0.793070 seconds.

Before using a clustering algorithm to segment an image, we have to compute a feature vector for each pixel. Each pixel’s feature vector should encode the qualities we care about in a good segmentation. More specifically, for a pair of pixels $p_i$ and $p_j$ with corresponding feature vectors $f_i$ and $f_j$: if $p_i$ and $p_j$ belong in the same cluster, the distance between $f_i$ and $f_j$ should be small; if not, it should be large.

# Load and display the image
img = io.imread('train.jpg')
H, W, C = img.shape

plt.imshow(img)
plt.axis('off')
plt.show()

The simplest feature vector of a pixel is its color vector. The output is shown below:

from segmentation import color_features, kmeans_fast
np.random.seed(0)

features = color_features(img)

# Result detection
assert features.shape == (H * W, C),\
    "Incorrect shape! Check your implementation."

assert features.dtype == np.float64,\
    "dtype of color_features should be float."

assignments = kmeans_fast(features, 8)
segments = assignments.reshape((H, W))

# Display image segmentation results
plt.imshow(segments, cmap='viridis')
plt.axis('off')
plt.show()

We replace each pixel with the average color of its cluster, and the result is as follows:

from utils import visualize_mean_color_image
visualize_mean_color_image(img, segments)

However, the results obtained this way ignore the spatial information of the image and cluster purely by color. We can instead combine the color and the position of each pixel. In short, for a pixel of color $(r, g, b)$ at position $(x, y)$ in the image, the feature vector is $(r, g, b, x, y)$. The dynamic ranges of color and position can differ greatly. For example, each color channel of an image may lie in the range [0, 255], while pixel positions may span a much larger range. Uneven scaling between the features in a feature vector can lead to poor clustering performance. One way to correct the difference in dynamic range is to standardize the feature vectors, so that each feature has zero mean and unit variance.
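Here is a small sketch of what standardization does to such feature vectors. The three feature rows are made-up values for illustration. The zero-variance guard is an addition in this sketch and is not present in `color_position_features`.

```python
import numpy as np

# Hypothetical features for three pixels: (r, g, b, row, col).
# Colors are in [0, 1] while positions run into the hundreds.
features = np.array([[0.1, 0.2, 0.3,   0.0,   0.0],
                     [0.2, 0.2, 0.1,  50.0, 100.0],
                     [0.9, 0.8, 0.7, 100.0, 200.0]])

mean = features.mean(axis=0)
std = features.std(axis=0)
std[std == 0] = 1.0   # guard (an addition in this sketch): avoid dividing by zero for constant columns

standardized = (features - mean) / std
```

After this step every column has mean 0 and standard deviation 1, so the position columns no longer dominate the Euclidean distance just because their raw numbers are larger.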

from segmentation import color_position_features
np.random.seed(0)

features = color_position_features(img)

# Result detection
assert features.shape == (H * W, C + 2),\
    "Incorrect shape! Check your implementation."

assert features.dtype == np.float64,\
    "dtype of color_position_features should be float."

assignments = kmeans_fast(features, 8)
segments = assignments.reshape((H, W))

# Display image segmentation results
plt.imshow(segments, cmap='viridis')
plt.axis('off')
plt.show()

Although there is still the problem of uneven segmentation, it is far better than the previous results without considering the spatial information.

visualize_mean_color_image(img, segments)

To compare segmentation methods objectively, we can evaluate them quantitatively. We took a small dataset of cat images in which each image comes with a ground-truth segmentation into foreground (cat) and background. We will quantitatively evaluate different segmentation methods (features and clustering methods) on this data set.
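To see how the evaluation works on a tiny case, the sketch below mirrors the idea behind `compute_accuracy` and `evaluate_segmentation`: each segment is tried as the foreground mask, and the one that agrees with the ground truth on the most pixels is kept. The two small arrays are made-up data for illustration.

```python
import numpy as np

# Hypothetical ground truth: 1 = foreground, 0 = background.
mask_gt = np.array([[1, 1, 0],
                    [1, 0, 0]])

# Hypothetical clustering result with three segments (labels 0, 1, 2).
segments = np.array([[2, 2, 0],
                     [2, 1, 1]])

accuracies = []
for label in range(segments.max() + 1):
    mask = (segments == label).astype(int)                # treat this segment as the foreground
    accuracies.append(float(np.mean(mask == mask_gt)))    # fraction of pixels that agree

best_label = int(np.argmax(accuracies))
best_accuracy = max(accuracies)
```

Note that this measure only asks whether one segment matches the foreground. With more than two segments the remaining ones all count as background, so a high score does not mean the segments are individually meaningful.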

from utils import load_dataset, compute_segmentation
from segmentation import evaluate_segmentation

# Load the small data set
imgs, gt_masks = load_dataset('./data')

# Set the parameters of image segmentation
num_segments = 3
clustering_fn = kmeans_fast
feature_fn = color_features
scale = 0.5

mean_accuracy = 0.0

segmentations = []

for i, (img, gt_mask) in enumerate(zip(imgs, gt_masks)):
    # Compute a segmentation for this image
    segments = compute_segmentation(img, num_segments,
                                    clustering_fn=clustering_fn,
                                    feature_fn=feature_fn,
                                    scale=scale)
    
    segmentations.append(segments)
    
    # Evaluate image segmentation results
    accuracy = evaluate_segmentation(gt_mask, segments)
    
    print('Accuracy for image %d: %0.4f' %(i, accuracy))
    mean_accuracy += accuracy
    
mean_accuracy = mean_accuracy / len(imgs)
print('Mean accuracy: %0.4f' % mean_accuracy)

The following output is displayed:

Accuracy for image 0: 0.8612
Accuracy for image 1: 0.9571
Accuracy for image 2: 0.9824
Accuracy for image 3: 0.9206
Accuracy for image 4: 0.7642
Accuracy for image 5: 0.8062
Accuracy for image 6: 0.6617
Accuracy for image 7: 0.4726
Accuracy for image 8: 0.8317
Accuracy for image 9: 0.7580
Accuracy for image 10: 0.6515
Accuracy for image 11: 0.8261
Accuracy for image 12: 0.7105
Accuracy for image 13: 0.6667
Accuracy for image 14: 0.7623
Accuracy for image 15: 0.5223
Mean accuracy: 0.7597

# Visualization of image segmentation results
N = len(imgs)
plt.figure(figsize=(15, 60))  # the height (60) is an assumption; adjust to the number of images
for i in range(N):

    plt.subplot(N, 3, (i * 3) + 1)
    plt.imshow(imgs[i])
    plt.axis('off')

    plt.subplot(N, 3, (i * 3) + 2)
    plt.imshow(gt_masks[i])
    plt.axis('off')

    plt.subplot(N, 3, (i * 3) + 3)
    plt.imshow(segmentations[i], cmap='viridis')
    plt.axis('off')

plt.show()

The code and data set for this blog can be found in my GitHub repository.

Finally, if you found this useful, please like and follow. It helps more people see this article. Thank you!