In this paper, the FreeAnchor method is proposed to replace the rigid, IoU-based assignment between anchors and ground-truth (GT) boxes with a freer, learned matching. The method formulates detector training as maximum likelihood estimation (MLE) and learns object classification, object localization, and the anchor-GT matching end to end. The experiments show a significant improvement.

Source: Xiaofei algorithm engineering notes public account

FreeAnchor: Learning to Match Anchors for Visual Object Detection

  • Paper: Arxiv.org/abs/1909.02…
  • Code: Github.com/zhangxiaoso…

Introduction


Conventional object detection networks match anchors to GT boxes by IoU, which runs into the following problems:

  • For objects whose features are not centered in the box, such as slender objects, spatial alignment alone cannot ensure that the anchor covers enough object features, which degrades classification and localization performance.
  • When objects are densely packed, IoU is not a reliable matching criterion.

Both problems come from matching preset anchors to GT boxes without taking the output of the network into account. The paper therefore proposes a learning-based matching method that formulates the matching as maximum likelihood estimation and learns object classification, object localization, and the matching end to end, with good results. The main contributions of the paper are as follows:

  • The training of the detector is formulated as maximum likelihood estimation, and the hand-crafted anchor-GT assignment is replaced by free anchor matching. This breaks the IoU constraint and lets each GT choose its anchors according to the maximum likelihood criterion.
  • A detection customized likelihood is defined and an end-to-end training mechanism for classification and localization is implemented. Maximizing the likelihood drives the network to learn how to match the optimal anchor while staying compatible with the NMS algorithm.

The Proposed Approach


To learn the matching between anchors and GT boxes, the training of the detector is first reformulated as maximum likelihood estimation, so that classification and localization are optimized from a likelihood perspective. A detection customized likelihood is then defined, which optimizes the matching by guaranteeing both recall and precision. For training, the detection customized likelihood is converted into a detection customized loss, so that object classification, object localization, and the matching can be learned jointly end to end.

Detector Training as Maximum Likelihood Estimation

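The loss function of a conventional one-stage detector is shown in Formula 1:

$$L(\theta) = \sum_{a_j \in A_+} \sum_{b_i \in B} C_{ij} L_{ij}^{cls}(\theta) + \beta \sum_{a_j \in A_+} \sum_{b_i \in B} C_{ij} L_{ij}^{loc}(\theta) + \sum_{a_j \in A_-} L_j^{bg}(\theta)$$

Here $\theta$ denotes the network parameters to be learned, $B$ is the set of GT boxes, and $L_{ij}^{cls}$, $L_{ij}^{loc}$ and $L_j^{bg}$ are the classification, localization and background losses. $C_{ij} \in \{0, 1\}$ indicates whether anchor $a_j$ is matched to GT $b_i$; it is 1 only when the IoU of the two exceeds a threshold. When an anchor qualifies for several GT boxes, the GT with the largest IoU is selected, so each anchor matches at most one GT. $A_+$ is the set of matched anchors, $A_-$ is the set of unmatched (background) anchors, and $\beta$ is the regularization factor that balances the classification and localization terms.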
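The hand-crafted assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the function names `iou` and `assign_by_iou`, the (x1, y1, x2, y2) box format, and the threshold of 0.5 are assumptions made for the example.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_by_iou(gt_boxes, anchors, threshold=0.5):
    """Return C with C[i][j] = 1 if anchor j is assigned to GT i, else 0.

    Assumes gt_boxes is non-empty. Each anchor goes to the GT with the
    largest IoU, and only if that IoU exceeds the threshold.
    """
    C = [[0] * len(anchors) for _ in gt_boxes]
    for j, anchor in enumerate(anchors):
        ious = [iou(gt, anchor) for gt in gt_boxes]
        best_i = max(range(len(gt_boxes)), key=lambda i: ious[i])
        if ious[best_i] > threshold:
            C[best_i][j] = 1
    return C


# Toy data (hypothetical): two GT boxes and four anchors.
gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
anchors = [(0, 0, 10, 10), (1, 1, 11, 11), (18, 18, 28, 28), (50, 50, 60, 60)]
C = assign_by_iou(gts, anchors)
```

In this toy input the second GT ends up with no anchor at all, because its closest anchor has an IoU of about 0.47, just under the threshold. The assignment depends only on box geometry and never looks at what the network predicts.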

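From the perspective of maximum likelihood estimation (MLE), the loss function $L(\theta)$ is converted into the likelihood of Formula 2:

$$P(\theta) = e^{-L(\theta)} = \prod_{a_j \in A_+} \Big( \sum_{b_i \in B} C_{ij} P_{ij}^{cls}(\theta) \Big) \prod_{a_j \in A_+} \Big( \sum_{b_i \in B} C_{ij} P_{ij}^{loc}(\theta) \Big) \prod_{a_j \in A_-} P_j^{bg}(\theta)$$

where $P_{ij}^{cls}(\theta) = e^{-L_{ij}^{cls}(\theta)}$ and $P_j^{bg}(\theta) = e^{-L_j^{bg}(\theta)}$ are the classification confidences and $P_{ij}^{loc}(\theta) = e^{-\beta L_{ij}^{loc}(\theta)}$ is the localization confidence. Minimizing $L(\theta)$ is equivalent to maximizing the likelihood $P(\theta)$.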
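The relation between loss and likelihood, $P(\theta) = e^{-L(\theta)}$, can be checked numerically. The sketch below uses made-up per-anchor loss values (two matched anchors and one background anchor) and a weight `beta` on the localization term; all numbers are hypothetical and only illustrate that a sum of losses turns into a product of confidences.

```python
import math

# Hypothetical per-anchor losses: two matched anchors, one background anchor.
cls_losses = [0.2, 0.7]
loc_losses = [0.1, 0.4]
bg_losses = [0.05]
beta = 0.75  # weight of the localization term (illustrative value)

# Loss view: a weighted sum of the individual losses.
total_loss = sum(cls_losses) + beta * sum(loc_losses) + sum(bg_losses)

# Likelihood view: a product of per-anchor confidences exp(-loss).
likelihood = 1.0
for loss in cls_losses:
    likelihood *= math.exp(-loss)
for loss in loc_losses:
    likelihood *= math.exp(-beta * loss)
for loss in bg_losses:
    likelihood *= math.exp(-loss)

# Both views describe the same quantity.
same = math.isclose(likelihood, math.exp(-total_loss))
```

Because the logarithm is monotonic, lowering `total_loss` and raising `likelihood` are the same optimization.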

Formula 2 optimizes the classification and localization of anchors strictly from the MLE perspective, but it ignores how to learn the matching matrix $C_{ij}$. Current detectors settle this with the IoU criterion and do not optimize the matching between GT boxes and anchors.

Detection Customized Likelihood

To optimize the matching between GT boxes and anchors, the paper introduces a detection customized likelihood into the CNN detection framework. It combines precision and recall while staying compatible with NMS.

First, a bag of candidate anchors $A_i$ is built for each GT $b_i$ by taking the anchors with the highest IoU with $b_i$. The network then learns the best match within each bag while maximizing the detection customized likelihood.
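Building the candidate set amounts to a top-k selection per GT. The following sketch assumes the IoU values are already available as a nested list `iou_matrix[i][j]` (GT i, anchor j); the name `build_anchor_bags` and the toy numbers are hypothetical.

```python
def build_anchor_bags(iou_matrix, bag_size):
    """Return, for every GT, the indices of its `bag_size` highest-IoU anchors."""
    bags = []
    for row in iou_matrix:
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        bags.append(ranked[:bag_size])
    return bags


# Toy IoU values for 2 GT boxes and 5 anchors.
iou_matrix = [
    [0.9, 0.6, 0.1, 0.0, 0.4],
    [0.0, 0.2, 0.7, 0.5, 0.3],
]
bags = build_anchor_bags(iou_matrix, bag_size=3)
```

Note that the bags may overlap: anchor 4 is a candidate for both GT boxes here. Unlike a hard assignment, the bag only restricts which anchors may be chosen and leaves the final choice to training.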

To optimize recall, every GT must first have at least one corresponding anchor. As shown in Formula 3, the anchor with the best combined classification and localization confidence is selected from the candidate set of each GT:

$$P_{recall}(\theta) = \prod_i \max_{a_j \in A_i} \big( P_{ij}^{cls}(\theta) P_{ij}^{loc}(\theta) \big)$$
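As a small illustration of the recall term, the sketch below multiplies, over all GT boxes, the best product of classification and localization confidence found in each candidate set. The function name `recall_likelihood` and the confidence values are assumptions for the example.

```python
def recall_likelihood(p_cls, p_loc, bags):
    """Product over GTs of the best cls * loc confidence inside each bag.

    p_cls[i][j], p_loc[i][j]: confidences of anchor j with respect to GT i.
    bags[i]: candidate anchor indices for GT i.
    """
    result = 1.0
    for i, bag in enumerate(bags):
        result *= max(p_cls[i][j] * p_loc[i][j] for j in bag)
    return result


# One GT with three candidate anchors (hypothetical confidences).
p_cls = [[0.9, 0.2, 0.5]]
p_loc = [[0.3, 0.8, 0.6]]
bags = [[0, 1, 2]]

recall = recall_likelihood(p_cls, p_loc, bags)
best = max(bags[0], key=lambda j: p_cls[0][j] * p_loc[0][j])
```

The selected anchor is index 2, which is neither the best classifier (anchor 0) nor the best localizer (anchor 1). The criterion rewards anchors that are good at both tasks at once.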

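To optimize precision, the detector needs to classify poorly localized anchors into the background class. The objective is shown in Formula 4; it penalizes anchors that miss every GT but are still not classified as background:

$$P_{precision}(\theta) = \prod_j \Big( 1 - P\{a_j \in A_-\} \big( 1 - P_j^{bg}(\theta) \big) \Big)$$

where $P\{a_j \in A_-\} = 1 - \max_i P\{a_j \to b_i\}$ is the probability that anchor $a_j$ matches none of the GT boxes, and $P\{a_j \to b_i\}$ is the probability that anchor $a_j$ correctly predicts GT $b_i$. To be compatible with NMS, $P\{a_j \to b_i\}$ must have the following properties:

  • It is a monotonically increasing function of $IoU_{ij}^{loc}$, the IoU between the box predicted from $a_j$ and $b_i$.
  • When $IoU_{ij}^{loc}$ is below a threshold $t$, $P\{a_j \to b_i\}$ is close to 0.
  • For each GT $b_i$, there is one and only one anchor $a_j$ satisfying $P\{a_j \to b_i\} = 1$.

These properties are captured by a saturated linear function, i.e.

$$\text{Saturated linear}(x, t_1, t_2) = \begin{cases} 0, & x \le t_1 \\ \dfrac{x - t_1}{t_2 - t_1}, & t_1 < x < t_2 \\ 1, & x \ge t_2 \end{cases}$$

with $P\{a_j \to b_i\} = \text{Saturated linear}\big( IoU_{ij}^{loc}, t, \max_j IoU_{ij}^{loc} \big)$.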
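A saturated linear function is 0 below a lower bound, 1 above an upper bound, and linear in between. The sketch below applies it to the IoU values between the predicted boxes and one GT, using the threshold as the lower bound and the largest IoU as the upper bound. The helper name `match_probabilities` and the sample IoU values are hypothetical; the default threshold 0.6 is the value the hyperparameter study later reports as best.

```python
def saturated_linear(x, t1, t2):
    """0 for x <= t1, 1 for x >= t2, linear interpolation in between."""
    if x <= t1:
        return 0.0
    if x >= t2:
        return 1.0
    return (x - t1) / (t2 - t1)


def match_probabilities(loc_ious, t=0.6):
    """Match probability of each anchor for a single GT.

    loc_ious[j]: IoU between the box predicted from anchor j and the GT.
    The anchor with the largest IoU gets probability 1 (if it exceeds t).
    """
    top = max(loc_ious)
    return [saturated_linear(x, t, top) for x in loc_ious]


probs = match_probabilities([0.5, 0.7, 0.8, 0.9])
```

The result is 0 for the anchor below the threshold, exactly 1 for the best-localized anchor, and values that grow with IoU for the rest, which matches the three properties listed above.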

With these definitions, the detection customized likelihood is given by Formula 5, which combines recall and precision and remains compatible with NMS:

$$P'(\theta) = P_{recall}(\theta) \times P_{precision}(\theta)$$

Maximizing this likelihood maximizes recall and precision at the same time and thereby achieves a free matching between GT boxes and anchors.
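The precision part of the likelihood can be illustrated as follows. For every anchor, the sketch computes the probability of missing all GT boxes and multiplies it by the probability of not being classified as background; the likelihood is high when this product is small for every anchor. The function name `precision_likelihood` and the input values are assumptions for the example.

```python
def precision_likelihood(match_probs, p_bg):
    """Product over anchors of 1 - P(miss all GTs) * (1 - P(background)).

    match_probs[i][j]: probability that anchor j correctly predicts GT i.
    p_bg[j]: background confidence of anchor j.
    """
    result = 1.0
    for j in range(len(p_bg)):
        p_miss_all = 1.0 - max(row[j] for row in match_probs)
        result *= 1.0 - p_miss_all * (1.0 - p_bg[j])
    return result


# One GT, two anchors: anchor 0 matches the GT, anchor 1 matches nothing.
match_probs = [[1.0, 0.0]]

# Anchor 1 is confidently classified as background: high precision likelihood.
good = precision_likelihood(match_probs, p_bg=[0.1, 0.9])
# Anchor 1 is wrongly treated as foreground: the likelihood drops.
bad = precision_likelihood(match_probs, p_bg=[0.1, 0.2])
```

Anchor 0 contributes a factor of 1 in both cases because it matches the GT, so its low background confidence is not penalized. Multiplying this value by the recall term gives the detection customized likelihood.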

Anchor Matching Mechanism

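To learn the matching effectively, the detection customized likelihood of Formula 5 is converted into the detection customized loss of Formula 6:

$$L'(\theta) = -\log P'(\theta) = -\sum_i \log \Big( \max_{a_j \in A_i} P_{ij}^{cls}(\theta) P_{ij}^{loc}(\theta) \Big) - \sum_j \log \Big( 1 - P\{a_j \in A_-\} \big( 1 - P_j^{bg}(\theta) \big) \Big)$$

The max function selects the most suitable anchor for each GT. During training, one anchor is selected from the candidate set $A_i$ and used to update the network parameters $\theta$.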

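At the early stage of training the network is randomly initialized, so the confidence of every anchor is very small and does not reflect the quality of the anchor. The max function is therefore replaced by the Mean-max function for selecting anchors:

$$\text{Mean-max}(X) = \frac{\sum_{x_j \in X} \frac{x_j}{1 - x_j}}{\sum_{x_j \in X} \frac{1}{1 - x_j}}$$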

When training is still insufficient, the Mean-max function behaves like the mean function, so almost all anchors in the candidate set are used for training. As training progresses, the Mean-max function moves toward the max function and finally becomes equivalent to it, so that only the best anchor is selected for training.
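The two regimes can be seen with a direct implementation of Mean-max, written here as sum(x / (1 - x)) / sum(1 / (1 - x)) over the confidences in one candidate set. The sample values are made up, and inputs are assumed to be strictly below 1 so that the division is defined.

```python
def mean_max(xs):
    """Weighted average of xs with weights 1 / (1 - x); requires every x < 1."""
    num = sum(x / (1.0 - x) for x in xs)
    den = sum(1.0 / (1.0 - x) for x in xs)
    return num / den


# Early in training: all confidences are small, the weights are nearly equal.
early = [0.01, 0.02, 0.03]
early_value = mean_max(early)   # close to the plain mean, 0.02

# Late in training: one anchor is confident and dominates the weighting.
late = [0.10, 0.20, 0.999]
late_value = mean_max(late)     # close to the maximum, 0.999
```

Since each value is weighted by 1 / (1 - x), a confidence near 1 receives a very large weight, which is what pushes the result toward the maximum without a hard switch between the two behaviours.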

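Replacing the max function in Formula 6 with the Mean-max function, applying focal loss to the second term, and weighting the two terms with $w_1$ and $w_2$ respectively gives the final detection customized loss, shown in Formula 7:

$$L''(\theta) = -w_1 \sum_i \log \big( \text{Mean-max}(X_i) \big) + w_2 \sum_j FL\_\Big( P\{a_j \in A_-\} \big( 1 - P_j^{bg}(\theta) \big) \Big)$$

where $X_i = \{ P_{ij}^{cls}(\theta) P_{ij}^{loc}(\theta) \mid a_j \in A_i \}$ is the likelihood set of the candidate set $A_i$, $w_1 = \alpha / \|B\|$, $w_2 = (1 - \alpha) / (n \|B\|)$, $n$ is the size of the candidate set, and $FL\_(p) = -p^{\gamma} \log(1 - p)$, with $\alpha$ and $\gamma$ taken from focal loss.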
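Putting the pieces together, the final loss can be sketched in plain Python. This is an illustrative scalar version under stated assumptions, not the released implementation: the names `free_anchor_loss` and `focal_negative`, the clamping constant `EPS`, and the toy inputs are hypothetical, the weights follow the paper's definitions w1 = alpha / |B| and w2 = (1 - alpha) / (n |B|), and the defaults alpha = 0.5, gamma = 2.0 are the values the paper reports as best.

```python
import math

EPS = 1e-12  # numerical guard; an implementation detail of this sketch


def mean_max(xs):
    """sum(x / (1 - x)) / sum(1 / (1 - x)), with x clamped below 1."""
    xs = [min(x, 1.0 - EPS) for x in xs]
    num = sum(x / (1.0 - x) for x in xs)
    den = sum(1.0 / (1.0 - x) for x in xs)
    return num / den


def focal_negative(p, gamma=2.0):
    """Focal-style penalty -p^gamma * log(1 - p) for the background term."""
    p = min(p, 1.0 - EPS)
    return -(p ** gamma) * math.log(1.0 - p)


def free_anchor_loss(bag_likelihoods, bg_terms, bag_size, alpha=0.5, gamma=2.0):
    """Detection customized loss for one image.

    bag_likelihoods[i]: cls * loc confidences of the anchors in the bag of GT i.
    bg_terms[j]: P(anchor j misses all GTs) * (1 - background confidence of j).
    """
    num_gt = len(bag_likelihoods)
    w1 = alpha / num_gt
    w2 = (1.0 - alpha) / (bag_size * num_gt)
    positive = -w1 * sum(math.log(max(mean_max(x), EPS)) for x in bag_likelihoods)
    negative = w2 * sum(focal_negative(p, gamma) for p in bg_terms)
    return positive + negative


# A well-trained state (one confident anchor, clean background) ...
good = free_anchor_loss([[0.9, 0.1]], [0.05, 0.05], bag_size=2)
# ... versus a poorly trained one (no confident anchor, noisy background).
bad = free_anchor_loss([[0.2, 0.1]], [0.6, 0.6], bag_size=2)
```

The loss is lower for the well-trained state, as expected. A real implementation would operate on batched tensors and compute the confidences from the network outputs; this version only mirrors the structure of the formula.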

With the detection customized loss, the detector is trained as described in Algorithm 1.

Experiments


In the experiments, FreeAnchor is implemented on top of RetinaNet; the only change is that the loss function is replaced with the detection customized loss proposed in the paper.

Learning-to-match

Compatibility with NMS

Parameter Setting

The hyperparameter experiments are summarized as follows:

  • Anchor bag size $n$: several sizes were compared, and 50 works best.
  • Background IoU threshold $t$: the threshold used when assigning the confidence $P\{a_j \to b_i\}$; of the values compared, 0.6 is the best.
  • Focal loss parameters $\alpha$ and $\gamma$: several values of each were compared, and the combination $\alpha = 0.5$, $\gamma = 2.0$ works best.
  • Loss regularization factor $\beta$: the weight that balances the classification and localization losses in Formula 1; 0.75 is the best.

Detection Performance

Conclusion


In this paper, the FreeAnchor method is proposed to replace the rigid, IoU-based assignment between anchors and GT boxes with a freer, learned matching. The method formulates detector training as maximum likelihood estimation (MLE) and learns object classification, object localization, and the anchor-GT matching end to end. In the experiments, the improvement is remarkable.




