Abstract: This article interprets the paper "Gaussian Bounding Boxes and Probabilistic Intersection-over-Union for Object Detection". For the object detection task, the paper proposes a new Gaussian bounding box representation (GBB) and a new method for calculating object similarity (ProbIoU).

This article was shared in the Huawei Cloud community as “Paper Interpretation Series 19: Gaussian Bounding Boxes and ProbIoU for Object Detection”, by BigDragon.

Address: arxiv.org/abs/2106.06…

Github address: github.com/ProbIOU

Existing work on improving object detection mainly focuses on training on larger datasets (the LVIS dataset), handling class imbalance, proposing better backbones, modeling long-range interactions (Transformers, LambdaNetworks), and analyzing the trade-off between classification and bounding box regression. Few studies address the representation of the bounding box itself. Existing object detection tasks mainly use horizontal bounding boxes (HBB) and oriented bounding boxes (OBB), both of which are rectangular or nearly rectangular. Existing measures of distance and similarity between objects include IoU (Intersection over Union), GIoU (Generalized IoU), DIoU (Distance IoU), PIoU (Pixel IoU), and the Gaussian Wasserstein Distance (GWD).

Compared with HBB algorithms, existing OBB algorithms detect slender and rotated objects better, but an OBB still does not fit the object's semantic segmentation mask closely. The paper therefore proposes a representation that fits the segmentation mask more closely, together with a corresponding method for calculating object similarity.

The contributions of this paper are as follows:

  • A new bounding box representation, Gaussian Bounding Boxes (GBB), is proposed.

GBB is closer to the shape of the object's semantic segmentation mask and is better suited to non-rectangular objects, for which it gives better detection results than HBB and OBB.

  • A new method for calculating object similarity (ProbIoU) is proposed.

ProbIoU is based on the Hellinger distance. It takes the properties of the 2D Gaussian distribution into account, satisfies all the requirements of a distance metric, can represent the true distance between different distributions, is differentiable everywhere, and can improve detection results for both OBB and HBB.

1. Gaussian Bounding Boxes (GBB)

To specify a two-dimensional Gaussian distribution over a 2D region, we need its mean μ and covariance matrix Σ, where μ = (x0, y0)^T and Σ can be written as follows:

$$\Sigma = \begin{bmatrix} a & c \\ c & b \end{bmatrix} = R_\theta \begin{bmatrix} a' & 0 \\ 0 & b' \end{bmatrix} R_\theta^{T}, \qquad R_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

In object detection, (x0, y0, a, b, c) can be used directly as the regression parameters, or the regression parameters can be expressed as (x0, y0, a', b', θ). The latter form is more consistent with the output of existing oriented-box detectors.

The conversion from horizontal and oriented boxes to Gaussian boxes rests on the following assumption: the object region is a 2D binary region Ω over which probability is uniformly distributed. The mean μ and covariance matrix Σ of that distribution can then be calculated by the following formulas:

$$\mu = \frac{1}{N} \int_{\mathbf{x} \in \Omega} \mathbf{x}\, d\mathbf{x}, \qquad \Sigma = \frac{1}{N} \int_{\mathbf{x} \in \Omega} (\mathbf{x} - \mu)(\mathbf{x} - \mu)^{T}\, d\mathbf{x}$$

where N is the area of the region Ω.

1.1 Convert HBB to GBB

For HBB, the binary region Ω is a rectangle centered at (x0, y0) with width W and height H, so μ = (x0, y0)^T. With x and y measured from the box center, its covariance matrix Σ can be calculated by the following formula:

$$\Sigma = \frac{1}{WH} \int_{-H/2}^{H/2} \int_{-W/2}^{W/2} \begin{bmatrix} x^2 & xy \\ xy & y^2 \end{bmatrix} dx\, dy = \begin{bmatrix} \frac{W^2}{12} & 0 \\ 0 & \frac{H^2}{12} \end{bmatrix}$$

So a = W²/12, b = H²/12, and c = 0. As the formula shows, such a Gaussian box can also be converted back into a horizontal box (W = √(12a), H = √(12b)), so the conversion is reversible.
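The HBB conversion and its inverse can be sketched in a few lines of Python. This is a minimal illustration, assuming the box is given as (center x, center y, width, height); the function names `hbb_to_gbb` and `gbb_to_hbb` are made up for this example and do not come from the paper's code.

```python
import math

def hbb_to_gbb(x0, y0, w, h):
    """Map a horizontal box (center, width, height) to GBB parameters (x0, y0, a, b, c)."""
    # The variance of a uniform distribution over an interval of length L is L**2 / 12.
    return (x0, y0, w * w / 12.0, h * h / 12.0, 0.0)

def gbb_to_hbb(x0, y0, a, b, c=0.0):
    """Invert hbb_to_gbb; only meaningful for an axis-aligned Gaussian (c == 0)."""
    if abs(c) > 1e-12:
        raise ValueError("c must be 0 for a horizontal box")
    return (x0, y0, math.sqrt(12.0 * a), math.sqrt(12.0 * b))

gbb = hbb_to_gbb(10.0, 20.0, 6.0, 3.0)
print(gbb)               # (10.0, 20.0, 3.0, 0.75, 0.0)
print(gbb_to_hbb(*gbb))  # (10.0, 20.0, 6.0, 3.0)
```

The round trip recovers the original width and height, which is what makes the HBB case reversible.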

1.2 Convert OBB to GBB

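Converting an OBB to a GBB requires (a', b', θ), as shown in the figure below. The variances a' and b' are obtained by treating the oriented box as a horizontal box in its own coordinate frame (a' = W²/12, b' = H²/12), and θ is the rotation angle of the box. The covariance matrix is then Σ = R_θ diag(a', b') R_θ^T, where R_θ is the 2D rotation matrix for angle θ, which gives:

$$a = a' \cos^2\theta + b' \sin^2\theta, \qquad b = a' \sin^2\theta + b' \cos^2\theta, \qquad c = (a' - b') \sin\theta \cos\theta$$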
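The OBB conversion can be sketched as follows. The example assumes that θ is in radians and measured counter-clockwise, and that a' = W²/12 and b' = H²/12 as in the horizontal case. The names `obb_to_gbb` and `gbb_to_obb` are illustrative only, and the inverse uses the standard closed-form eigen-decomposition of a symmetric 2×2 matrix rather than anything specified in the paper.

```python
import math

def obb_to_gbb(x0, y0, w, h, theta):
    """Map an oriented box (center, width, height, angle in radians) to (x0, y0, a, b, c)."""
    a_p = w * w / 12.0  # variance along the box's own width axis
    b_p = h * h / 12.0  # variance along the box's own height axis
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Entries of R(theta) * diag(a_p, b_p) * R(theta)^T, written out by hand.
    a = a_p * cos_t ** 2 + b_p * sin_t ** 2
    b = a_p * sin_t ** 2 + b_p * cos_t ** 2
    c = (a_p - b_p) * sin_t * cos_t
    return (x0, y0, a, b, c)

def gbb_to_obb(x0, y0, a, b, c):
    """Recover (x0, y0, w, h, theta) from the eigen-decomposition of the covariance."""
    theta = 0.5 * math.atan2(2.0 * c, a - b)  # direction of the larger-variance axis
    half_trace = 0.5 * (a + b)
    radius = math.hypot(0.5 * (a - b), c)
    a_p = half_trace + radius                 # larger eigenvalue
    b_p = max(half_trace - radius, 0.0)       # smaller eigenvalue, clamped against round-off
    return (x0, y0, math.sqrt(12.0 * a_p), math.sqrt(12.0 * b_p), theta)

gbb = obb_to_gbb(0.0, 0.0, 4.0, 2.0, math.pi / 6)
print(gbb)
print(gbb_to_obb(*gbb))
```

Note that this inverse always reports the longer side as the width, so a box with W < H comes back with its sides swapped and its angle shifted by 90 degrees. When a' = b' the angle is not recoverable at all.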

1.3 Convert Polygon Box (PBB) to GBB

The conversion of polygonal frame to Gaussian frame can be calculated according to the following formula:
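As one way to carry out this conversion, the sketch below computes the mean and covariance of a uniform distribution over a simple (non-self-intersecting) polygon using the standard shoelace-style formulas for area, centroid and second moments. These are textbook formulas and may differ in form from the one used in the paper; `polygon_to_gbb` is an illustrative name.

```python
def polygon_to_gbb(vertices):
    """Map a simple polygon [(x, y), ...] to GBB parameters (x0, y0, a, b, c)."""
    area2 = sx = sy = sxx = syy = sxy = 0.0
    n = len(vertices)
    for i in range(n):
        xi, yi = vertices[i]
        xj, yj = vertices[(i + 1) % n]
        cross = xi * yj - xj * yi  # twice the signed area of triangle (origin, v_i, v_j)
        area2 += cross
        sx += (xi + xj) * cross
        sy += (yi + yj) * cross
        sxx += (xi * xi + xi * xj + xj * xj) * cross
        syy += (yi * yi + yi * yj + yj * yj) * cross
        sxy += (xi * yj + 2.0 * xi * yi + 2.0 * xj * yj + xj * yi) * cross
    if area2 == 0.0:
        raise ValueError("degenerate polygon with zero area")
    area = 0.5 * area2  # signed area; the sign cancels in every ratio below
    mx = sx / (6.0 * area)
    my = sy / (6.0 * area)
    # Second moments about the origin, shifted to central moments.
    a = sxx / (12.0 * area) - mx * mx
    b = syy / (12.0 * area) - my * my
    c = sxy / (24.0 * area) - mx * my
    return (mx, my, a, b, c)

# A 4 x 2 rectangle should match the HBB case: a = 16/12, b = 4/12, c = 0.
print(polygon_to_gbb([(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]))
```

Because the signed area is used throughout, the result does not depend on whether the vertices are listed clockwise or counter-clockwise.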

2. ProbIoU and localization loss function

2.1 ProbIoU

Bhattacharyya Distance (BD)

The Bhattacharyya coefficient (BC) is used to measure the similarity between different GBBs. The BC between two probability density functions p(x) and q(x) is calculated by the following formula:

$$BC(p, q) = \int_{\mathbb{R}^2} \sqrt{p(\mathbf{x})\, q(\mathbf{x})}\, d\mathbf{x}$$

where BC(p, q) ∈ [0, 1], and BC(p, q) = 1 if and only if the two distributions are the same.

From BC(p, q), the Bhattacharyya distance (BD) between two distributions can be obtained. The BD between the probability density functions p(x) and q(x) is:

$$BD(p, q) = -\ln BC(p, q)$$

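When p ~ N(μ1, Σ1) and q ~ N(μ2, Σ2), where in object detection the means are 2D vectors and the covariances are 2×2 matrices, the Bhattacharyya distance has the closed form:

$$BD(p, q) = \frac{1}{8} (\mu_1 - \mu_2)^{T} \Sigma^{-1} (\mu_1 - \mu_2) + \frac{1}{2} \ln \frac{\det \Sigma}{\sqrt{\det \Sigma_1 \det \Sigma_2}}, \qquad \Sigma = \frac{\Sigma_1 + \Sigma_2}{2}$$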

Hellinger Distance (HD)

Since the Bhattacharyya distance does not satisfy the triangle inequality, it is not a true distance metric. The Hellinger distance (HD) is therefore adopted:

$$HD(p, q) = \sqrt{1 - BC(p, q)}$$

where HD(p, q) ∈ [0, 1], and HD(p, q) = 0 if and only if the two distributions are the same.

Probabilistic IoU (ProbIoU)

Based on the Hellinger distance above, the paper proposes ProbIoU, a similarity measure between Gaussian distributions, defined as:

$$ProbIoU(p, q) = 1 - HD(p, q)$$
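The whole chain from two GBBs to a ProbIoU value can be sketched as below. The sketch assumes the standard closed-form Bhattacharyya distance between two Gaussians, expanded by hand for the 2×2 case, together with BC = exp(-BD), HD = sqrt(1 - BC) and ProbIoU = 1 - HD. The function name `probiou` and the tuple layout (x, y, a, b, c) are illustrative choices, not the paper's API.

```python
import math

def probiou(p, q):
    """ProbIoU between two GBBs given as (x, y, a, b, c) tuples."""
    x1, y1, a1, b1, c1 = p
    x2, y2, a2, b2, c2 = q
    sa, sb, sc = a1 + a2, b1 + b2, c1 + c2
    det_sum = sa * sb - sc * sc  # determinant of (Sigma1 + Sigma2)
    dx, dy = x1 - x2, y1 - y2
    # First BD term: distance between the means, weighted by the averaged covariance.
    mean_term = 0.25 * (sa * dy * dy + sb * dx * dx - 2.0 * sc * dx * dy) / det_sum
    # Second BD term: mismatch between the two covariance matrices.
    det_p = a1 * b1 - c1 * c1
    det_q = a2 * b2 - c2 * c2
    cov_term = 0.5 * math.log(det_sum / (4.0 * math.sqrt(det_p * det_q)))
    bd = mean_term + cov_term
    bc = math.exp(-bd)
    hd = math.sqrt(max(1.0 - bc, 0.0))  # clamp guards against tiny negative round-off
    return 1.0 - hd

unit = (0.0, 0.0, 1.0, 1.0, 0.0)
shifted = (2.0, 0.0, 1.0, 1.0, 0.0)
print(probiou(unit, unit))     # identical boxes give 1.0
print(probiou(unit, shifted))  # BD = 0.5 here, so this is 1 - sqrt(1 - exp(-0.5))
```

Both covariance matrices must be positive definite (a·b − c² > 0), otherwise the logarithm and square root are undefined.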

2.2 Localization loss function

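Let the predicted GBB be p = (x1, y1, a1, b1, c1) and the ground-truth GBB be q = (x2, y2, a2, b2, c2). The loss functions are:

$$L_1(p, q) = HD(p, q) = \sqrt{1 - e^{-BD(p, q)}} \in [0, 1], \qquad L_2(p, q) = BD(p, q) = -\ln\left(1 - L_1^2(p, q)\right) \in [0, \infty)$$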

However, when the predicted GBB is far from the ground-truth GBB, the value of the L1 loss is close to 1, so training produces small gradients and converges slowly. The L2 loss avoids this problem, but its geometric relationship to IoU is weaker. It is therefore recommended to train with the L2 loss first and then switch to the L1 loss.
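The saturation effect can be seen numerically. The sketch below assumes L1 = sqrt(1 - exp(-BD)) and L2 = BD, treats the Bhattacharyya distance `bd` as a plain number, and compares the derivative of each loss with respect to BD. The helper `pick_loss` and its `switch_epoch` parameter are hypothetical; the source does not say when the switch should happen.

```python
import math

def l1_loss(bd):
    """L1 = Hellinger distance = 1 - ProbIoU, bounded in [0, 1]."""
    return math.sqrt(1.0 - math.exp(-bd))

def l2_loss(bd):
    """L2 = the Bhattacharyya distance itself, unbounded above."""
    return bd

def dl1_dbd(bd):
    """Derivative of L1 with respect to BD (valid for bd > 0); dL2/dBD is always 1."""
    return math.exp(-bd) / (2.0 * math.sqrt(1.0 - math.exp(-bd)))

def pick_loss(epoch, switch_epoch):
    """Hypothetical schedule: L2 for the early epochs, L1 afterwards."""
    return l2_loss if epoch < switch_epoch else l1_loss

for bd in (0.1, 1.0, 10.0):
    # As bd grows, L1 approaches 1 and its derivative shrinks towards 0.
    print(bd, l1_loss(bd), dl1_dbd(bd))
```

For a badly misplaced prediction (large BD), the L1 derivative is close to zero while the L2 derivative stays at 1, which is the motivation for starting with L2.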

2.3 ProbIoU features

ProbIoU based on Hellinger Distance has the following characteristics:

  • The three functions are differentiable with respect to all parameters;

  • The Hellinger distance satisfies all the requirements of a distance metric;

  • The loss function is invariant to object scaling.
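The scale invariance can be checked directly from the closed-form Bhattacharyya distance between two Gaussians. The short derivation below assumes that scaling both boxes by a factor s > 0 maps each mean μ_i to sμ_i and each covariance Σ_i to s²Σ_i, with Σ = (Σ1 + Σ2)/2; the symbol s is introduced here only for illustration.

```latex
\begin{aligned}
\tfrac{1}{2}\left(s^2\Sigma_1 + s^2\Sigma_2\right) &= s^2\Sigma, \\
\tfrac{1}{8}(s\mu_1 - s\mu_2)^{T}(s^2\Sigma)^{-1}(s\mu_1 - s\mu_2)
  &= \tfrac{1}{8}(\mu_1 - \mu_2)^{T}\Sigma^{-1}(\mu_1 - \mu_2), \\
\tfrac{1}{2}\ln\frac{\det(s^2\Sigma)}{\sqrt{\det(s^2\Sigma_1)\,\det(s^2\Sigma_2)}}
  &= \tfrac{1}{2}\ln\frac{s^4\det\Sigma}{s^4\sqrt{\det\Sigma_1\,\det\Sigma_2}}
   = \tfrac{1}{2}\ln\frac{\det\Sigma}{\sqrt{\det\Sigma_1\,\det\Sigma_2}}.
\end{aligned}
```

Both terms of BD are unchanged by the scaling, so HD, ProbIoU and the two losses are unchanged as well.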

3. Experimental results

3.1 Experimental results of different detection frames

After training on COCO 2017, the IoU obtained with GBB, OBB and HBB is compared, and the following conclusions can be drawn:

  • The mean IoU of GBB in COCO 2017 was higher than that of HBB and OBB

  • GBB is inferior to HBB and OBB in the traffic light, microwave and TV categories

3.2 ProbIoU Loss improves HBB and OBB detection

The ProbIoU-based loss function was applied to the HBB detection task, with EfficientDet D0 and SSD 300 each trained on the Pascal VOC 2007 dataset. As shown in the following table, using ProbIoU instead of IoU improves AP and AP75, and models trained with the ProbIoU-based loss function achieve higher AP.

ProbIoU-based loss functions were also applied to the OBB detection task with R-50 RetinaNet and R-50 R3Det, trained on the DOTA v1 and HRSC2016 datasets. As shown in the following table, with the RetinaNet model on DOTA v1, the AP of the ProbIoU-based loss function is 2% higher than that of GWD-RET. With the R3Det model, the results are close to GWD-REP and GWD-RET. On HRSC2016, the results of the ProbIoU-based loss function are comparable to GWD-REP and better than GWD-RET.

4. Summary

The approach presented in the paper has three main parts:

  • A bounding box representation based on the Gaussian distribution (GBB)

  • ProbIoU, based on the Hellinger distance, together with the corresponding loss functions L1 and L2

  • During training, combining the L1 and L2 loss functions is more effective

The limitations of the method include the following two points:

  • For an isotropic Gaussian distribution, the rotation angle cannot be determined

  • For slender objects, gradients tend to become too large during training, which makes training unstable.
