Preface:

This article introduces the main characteristics of Siamese networks, the steps for training and testing them, their applications, their advantages and disadvantages, why a Siamese network is called a one-shot classifier, and the loss functions used to train it.

Main features of Siamese networks

1. A Siamese network passes two different inputs through two identical sub-networks that share the same architecture, parameters, and weights.

2. The two sub-networks mirror each other, like conjoined twins. Any change to the architecture, parameters, or weights of one sub-network therefore applies to the other as well.

3. Each sub-network outputs an encoding of its input, and the two encodings are compared to measure the difference between the inputs.

4. The goal of a Siamese network is to use a similarity score to classify whether two inputs are the same or different. The network can be trained with binary cross-entropy, contrastive loss, or triplet loss, all of which are techniques from general distance metric learning.

5. A Siamese network is a one-shot classifier: it uses discriminative features to generalize to unfamiliar categories from an unknown distribution.

Train the Siamese neural network

1. Load data sets containing different classes

2. Create positive and negative data pairs. A pair is positive when the two inputs belong to the same class and negative when they belong to different classes.

3. Build a convolutional neural network (CNN) that uses a fully connected layer to output a feature encoding. Both inputs are passed through the sister CNNs, which must have the same architecture, hyperparameters, and weights.

4. Build a difference layer that computes the Euclidean distance between the encodings output by the two sister CNNs.

5. The last layer is a fully connected layer with a single node; its sigmoid activation outputs the similarity score.

6. Use binary cross entropy as the loss function.
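
The steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not a trainable model: the names `make_pairs`, `encode`, `similarity`, and `binary_cross_entropy` are hypothetical, a single linear layer with tanh stands in for the sister CNN, and the weights, `scale`, and `bias` are hand-picked toy values rather than learned ones.

```python
import math
import random
from itertools import combinations

def make_pairs(samples_by_class, rng):
    """Step 2: build (x1, x2, label) pairs; label 1 = same class, 0 = different."""
    pairs = []
    classes = sorted(samples_by_class)
    for cls in classes:
        # Positive pairs: every combination of two samples from the same class.
        for a, b in combinations(samples_by_class[cls], 2):
            pairs.append((a, b, 1))
        # Negative pairs: match each sample with one sample from another class.
        others = [c for c in classes if c != cls]
        for a in samples_by_class[cls]:
            other = rng.choice(others)
            pairs.append((a, rng.choice(samples_by_class[other]), 0))
    return pairs

def encode(x, weights):
    """Step 3: stand-in for the sister CNN (one linear layer followed by tanh)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def similarity(x1, x2, weights, scale, bias):
    """Steps 4-5: Euclidean distance between encodings, then one sigmoid node."""
    e1 = encode(x1, weights)  # both inputs go through the SAME weights
    e2 = encode(x2, weights)
    dist = math.dist(e1, e2)
    # A negative scale maps a small distance to a score near 1.
    return 1.0 / (1.0 + math.exp(-(scale * dist + bias)))

def binary_cross_entropy(score, label, eps=1e-7):
    """Step 6: BCE between the similarity score and the 0/1 pair label."""
    score = min(max(score, eps), 1.0 - eps)
    return -(label * math.log(score) + (1 - label) * math.log(1.0 - score))

rng = random.Random(0)
data = {"a": [[0.9, 0.1], [1.0, 0.0]], "b": [[0.1, 0.9], [0.0, 1.0]]}
weights = [[1.0, -1.0], [-1.0, 1.0]]  # toy values, not learned
pairs = make_pairs(data, rng)
losses = [binary_cross_entropy(similarity(x1, x2, weights, -4.0, 2.0), y)
          for x1, x2, y in pairs]
mean_loss = sum(losses) / len(losses)
print(f"{len(pairs)} pairs, mean BCE loss = {mean_loss:.4f}")
```

In a real implementation, a deep learning framework would compute the gradient of `mean_loss` and update the shared weights; the sketch stops at the forward pass and the loss.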

Test the Siamese neural network

1. Pass two inputs to the trained model, which outputs the similarity score.

2. Because the last layer uses the sigmoid activation function, it outputs a value between 0 and 1. A similarity score close to 1 means the two inputs are similar; a score close to 0 means they are not. A good rule of thumb is to use a similarity cutoff threshold of 0.5.
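
The thresholding step can be illustrated with a few lines of Python. The function name `predict_same` and the example scores are made up for this sketch; in practice the scores would come from the trained model.

```python
def predict_same(score, threshold=0.5):
    """Return True when the sigmoid similarity score meets the cutoff."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("a sigmoid score must lie in [0, 1]")
    return score >= threshold

# Hypothetical similarity scores for three input pairs.
scores = {"pair_1": 0.93, "pair_2": 0.08, "pair_3": 0.55}

decisions = {name: predict_same(s) for name, s in scores.items()}
strict = {name: predict_same(s, threshold=0.7) for name, s in scores.items()}
print(decisions)
print(strict)
```

Raising the threshold above 0.5 accepts fewer pairs as matches, which trades false accepts for false rejects; that trade-off can matter in verification tasks.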

Application of Siamese neural network

1. Verify the signature

2. Facial recognition

3. Compare fingerprints

4. Assess the severity of disease according to clinical grading

5. Text similarity for matching job descriptions to résumés

6. Text similarity for matching similar questions

Why is Siamese neural network called one-shot classification?

1. A one-shot classification model can make correct predictions from just a single training sample of each new category.

2. Siamese networks use supervised training to learn general input features, and then use what they learned from the training data to make predictions about unknown class distributions.

3. A trained Siamese network uses one-shot learning to predict whether two inputs are similar or dissimilar, even when the new distributions contain very few samples.
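
One way to picture one-shot classification is to compare a query against a single labeled example per class and pick the most similar one. In the sketch below, `similarity` is a hypothetical stand-in for a trained Siamese network (it simply turns Euclidean distance into a score), and the support set is invented for illustration.

```python
import math

def similarity(x1, x2):
    """Hypothetical stand-in for a trained Siamese network: maps the
    Euclidean distance between two inputs to a score in (0, 1]."""
    return math.exp(-math.dist(x1, x2))

def one_shot_classify(query, support_set):
    """Return the label whose single support example is most similar to the query."""
    return max(support_set, key=lambda label: similarity(query, support_set[label]))

# One labeled example per class; these classes need not appear during training.
support_set = {"cat": [0.9, 0.1], "dog": [0.1, 0.9], "bird": [0.5, 0.5]}
prediction = one_shot_classify([0.8, 0.3], support_set)
print(prediction)
```

Adding a new class only requires adding one entry to `support_set`; the similarity function itself is not retrained.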

Advantages of Siamese networks

1. A Siamese network is a one-shot classification model: it needs only one training sample per class to make a prediction.

2. It is more robust to class imbalance because it needs very little information about each class, so it can be used on data sets in which some classes have very few samples.

3. The one-shot learning ability of a Siamese network does not rely on domain-specific knowledge; it relies on deep learning techniques instead.

Disadvantages of Siamese networks

The network outputs only a similarity score, not a probability. The probabilities of mutually exclusive events sum to 1, whereas a distance has no such constraint and is not limited to values less than or equal to 1.

Why do sister networks have to be the same?

The two sister convolutional networks must have the same model architecture, hyperparameters, and weights. Because each network then computes the same function, two extremely similar images cannot be mapped to very different locations in feature space.
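
A small numerical sketch shows the effect. The one-layer `encode` function and both weight matrices are toy assumptions chosen for this example; they are not taken from a real model.

```python
import math

def encode(x, weights):
    """Toy one-layer encoder; `weights` is a list of rows."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

shared = [[1.0, -1.0], [0.5, 0.5]]
other = [[-1.0, 1.0], [0.5, -0.5]]  # a second, different set of weights

x1 = [0.80, 0.20]
x2 = [0.81, 0.19]  # almost the same input

# Same weights on both branches: near-identical inputs stay close together.
d_shared = math.dist(encode(x1, shared), encode(x2, shared))
# Different weights per branch: the same two inputs land far apart.
d_unshared = math.dist(encode(x1, shared), encode(x2, other))

print(f"shared weights:   {d_shared:.3f}")
print(f"unshared weights: {d_unshared:.3f}")
```

With shared weights, the distance between encodings reflects only the difference between the inputs, which is what the similarity score is meant to measure.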

Loss functions used by Siamese networks

1. A Siamese network uses a similarity score to predict whether two inputs are similar or different. It is a metric learning method: it learns the relative distance between its inputs.

2. The network can be trained with binary cross-entropy, contrastive loss, or triplet loss.

3. A Siamese network performs binary classification, assigning each input pair to the similar or dissimilar category, so binary cross-entropy is the default loss function.

Contrastive loss function

1. The contrastive loss function distinguishes between similar and different images by comparing two inputs. It helps when you are training without knowing all the categories and training data is limited. It produces a data encoding that can still be used when new classes become available in the future.

2. Contrastive loss requires both positive and negative training pairs. A positive pair contains an anchor sample and a positive sample, while a negative pair contains an anchor sample and a negative sample.

3. The objective of the contrastive loss function is to give positive pairs a small distance and negative pairs a large distance.

The contrastive loss is commonly written as:

$$L = (1 - Y)\,\frac{1}{2}\,D_w^2 + Y\,\frac{1}{2}\,\bigl(\max(0,\; m - D_w)\bigr)^2$$

In the equation above, Y is 0 when the two inputs come from the same class; otherwise, Y is 1.

m is the margin, which is always greater than 0. It defines a radius: dissimilar pairs whose distance exceeds the margin contribute no loss.

Dw is the Euclidean distance between the outputs of the sister Siamese networks.
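
The definitions of Y, the margin, and Dw can be turned into a short Python function. This sketch assumes the widely used form of contrastive loss, in which similar pairs are penalized by half the squared distance and dissimilar pairs by half the squared shortfall from the margin; the function name and the sample encodings are illustrative.

```python
import math

def contrastive_loss(e1, e2, y, margin=1.0):
    """Contrastive loss for one pair of encodings.

    y = 0 when the inputs come from the same class, y = 1 otherwise.
    """
    d = math.dist(e1, e2)  # Dw: Euclidean distance between the two encodings
    same_term = (1 - y) * 0.5 * d ** 2
    diff_term = y * 0.5 * max(0.0, margin - d) ** 2
    return same_term + diff_term

anchor = [0.0, 0.0]
near = [0.3, 0.4]  # distance 0.5 from the anchor
far = [3.0, 4.0]   # distance 5.0 from the anchor

loss_similar_near = contrastive_loss(anchor, near, y=0)    # small penalty
loss_different_near = contrastive_loss(anchor, near, y=1)  # inside the margin
loss_different_far = contrastive_loss(anchor, far, y=1)    # beyond the margin
print(loss_similar_near, loss_different_near, loss_different_far)
```

The last case illustrates the role of the margin: once a dissimilar pair is farther apart than the margin, it no longer contributes to the loss.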

Triplet Loss

Triplet loss uses triplets of data instead of pairs. Each triplet consists of an anchor, a positive sample, and a negative sample. Triplet loss is mainly used for facial recognition.

Triplet loss minimizes the distance between the anchor and positive encodings and maximizes the distance between the anchor and negative encodings.

Triplet loss pushes D(a,p) toward 0 and D(a,n) to be greater than D(a,p) + margin.
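
The condition on D(a,p) and D(a,n) corresponds to a hinge on their difference. The sketch below assumes the common form max(0, D(a,p) - D(a,n) + margin) with plain Euclidean distances; some formulations use squared distances instead, and the margin value and sample encodings here are arbitrary.

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on the encodings of one triplet."""
    d_ap = math.dist(anchor, positive)  # D(a, p)
    d_an = math.dist(anchor, negative)  # D(a, n)
    return max(0.0, d_ap - d_an + margin)

anchor = [0.0, 0.0]
positive = [0.3, 0.4]         # D(a, p) = 0.5
easy_negative = [0.6, 0.8]    # D(a, n) = 1.0, already beyond D(a, p) + margin
hard_negative = [0.36, 0.48]  # D(a, n) = 0.6, still inside D(a, p) + margin

loss_easy = triplet_loss(anchor, positive, easy_negative)
loss_hard = triplet_loss(anchor, positive, hard_negative)
print(loss_easy, loss_hard)
```

Only triplets that violate the margin produce a nonzero loss, so training effort concentrates on negatives that are still too close to the anchor.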

Conclusion

The Siamese network, inspired by Siamese twins, is a one-shot classifier used to distinguish between similar and different images. It can be used even when you do not know all the training classes and training data is limited. Siamese networks are based on metric learning: they are trained with binary cross-entropy, contrastive loss, or triplet loss to learn the relative distance between their inputs.

Original link

Medium.com/swlh/one-sh…
