As navigation, driver assistance, autonomous driving and other technologies develop, they place ever higher requirements on map precision. Conventional road-level maps have many deficiencies for intelligent transportation systems. To meet the requirements of autonomous driving applications, we propose a method of producing high-precision maps using visual-inertial navigation technology.

This paper first introduces the mainstream vision and inertial navigation equipment, the framework and key technologies of visual-inertial fusion, and AutoNavi's vision-based scheme for computing the road signs and ground markings of a high-precision map, and finally summarizes the challenges facing this technology in high-precision mapping and its future directions.

Visual-inertial navigation technology has broad prospects

High-precision maps are one of the core technologies of autonomous driving. An accurate map is crucial to the positioning, navigation, control and safety of autonomous vehicles. As autonomous driving continues to develop, more and more car companies are choosing to cooperate with map suppliers. High-precision maps must address both scale and real-time updates; AutoNavi provides large-scale data services for vehicles of different brands and holds a leading position in the high-precision map industry.

At present, AutoNavi has completed high-precision map data for more than 320,000 kilometers of high-grade roads in China, using two methods: combined lidar acquisition and image-based visual-inertial fusion.

On the one hand, image-based visual-inertial fusion greatly reduces the cost of data acquisition. On the other hand, vision-based high-precision maps have certain advantages in recognition and can improve the efficiency of lane-level element production. This technique therefore has broad prospects for the large-scale production of high-precision maps.

Visual-inertial navigation hardware

Visual equipment

Depending on how they work, cameras can be divided into three categories: monocular, stereo (binocular), and RGB-D.

A monocular camera has a simple structure and low cost. Its disadvantage is that a photo is a projection from 3D to a 2D plane and lacks depth information: the distance between an object in the scene and the camera cannot be recovered from a single image.

A binocular (stereo) camera consists of two monocular cameras whose distance from each other (the baseline) is known. The spatial position of each pixel is estimated from the baseline and the disparity between the two views. The depth range a binocular camera can measure is related to the baseline: the larger the baseline, the farther it can measure.

For this reason, the binocular cameras mounted on drones are usually large. Their disadvantages are that configuration and calibration are complicated, the depth range and accuracy are limited by the baseline and the camera resolution, and computing the disparity consumes substantial computing resources.
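The baseline-depth relationship above can be sketched with the standard pinhole stereo formula Z = f · b / d for a rectified pair; the focal length, baseline and disparity values below are illustrative assumptions, not measurements from any real rig:

```python
# Stereo depth from disparity under a rectified pinhole model.
# Z = focal_length * baseline / disparity.

def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point from its disparity in a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# With a fixed focal length and baseline, distant objects produce small
# disparities; a longer baseline keeps those disparities measurable,
# which is why a larger baseline extends the depth range.
near = stereo_depth(focal_px=700.0, baseline_m=0.12, disparity_px=40.0)  # 2.1 m
far = stereo_depth(focal_px=700.0, baseline_m=0.12, disparity_px=2.0)    # 42.0 m
```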

A depth (RGB-D) camera works by using infrared structured light or, like a laser sensor, by actively emitting light toward an object and receiving the return to measure the distance between the object and the camera. Because depth is obtained by physical measurement rather than by software computation as in a binocular camera, it saves a great deal of calculation.

The disadvantages of depth cameras are their narrow measurement range, large noise, small field of view, susceptibility to sunlight interference, and inability to measure transmissive materials, all of which make them difficult to apply in outdoor scenes.

Given the demands of mass-producing high-precision maps, the monocular camera is the mainstream vision device for high-precision mapping because of its low cost and simple installation.

Inertial navigation equipment

An inertial navigation system (INS) is an autonomous navigation system that neither depends on external information nor radiates energy to the outside. Its working environments include not only the air and the ground but also underwater.

The basic working principle of inertial navigation follows Newton's laws of motion: by measuring the acceleration of the carrier in an inertial frame of reference, integrating it over time, and transforming the result into the navigation frame, one obtains the carrier's velocity, yaw angle, position and other information in that frame. Inertial navigation is widely used in military applications, surveying and mapping, resource exploration, robotics, autonomous driving, and more.
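The double integration described above can be sketched in one dimension; real systems work in 3D and rotate measurements into the navigation frame first, and the acceleration profile below is purely illustrative:

```python
# Minimal 1-D dead-reckoning sketch: integrate acceleration once for
# velocity and again for position, as an INS does.

def dead_reckon(accels, dt, v0=0.0, p0=0.0):
    """Euler-integrate a sequence of acceleration samples."""
    v, p = v0, p0
    for a in accels:
        v += a * dt      # velocity: integral of acceleration
        p += v * dt      # position: integral of velocity
    return v, p

# Constant 1 m/s^2 for 10 s starting from rest:
v, p = dead_reckon([1.0] * 100, dt=0.1)
# v ~= 10 m/s; p ~= 50.5 m (Euler integration slightly overestimates
# the analytic 50 m -- a hint at why INS errors accumulate over time).
```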

Inertial navigation systems have the advantages of interference resistance, strong autonomy, high data rates and good stability. Ordered by drift rate from small to large, they can be divided into navigation grade, tactical grade, industrial grade, automotive grade and consumer grade. At present, tactical-grade inertial equipment is mostly used in autonomous driving and high-precision map making to meet the demand for high-precision positioning.

In addition, inertial navigation has developed into several variants: flexure-gyro INS, fiber-optic INS, laser-gyro INS, MEMS INS, and others. Among these, micro-electro-mechanical systems (MEMS) are widely used because of their small size, light weight, low power consumption, low price and shock resistance. MEMS devices have now extended into medium- and low-precision tactical applications.

An inertial navigation system accumulates error when used alone. In practice it is usually combined with a Global Navigation Satellite System (GNSS), represented by GPS and BeiDou, and other auxiliary systems to obtain the global position of the carrier.

When the satellite signal is lost, the GNSS/INS combination can still provide relatively accurate real-time attitude estimates. For surveying and mapping applications that do not require real-time output, smoothing algorithms can achieve even higher positioning accuracy.
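A loosely coupled GNSS/INS scheme of the kind described above can be sketched as a toy one-dimensional Kalman filter: the filter propagates position with INS velocity and corrects it with GNSS fixes when they are available. The noise values and the trajectory are illustrative assumptions, not parameters of any production system:

```python
# Toy 1-D loosely coupled GNSS/INS fusion.

def fuse(ins_vel, gnss_pos, dt=1.0, q=0.5, r=4.0):
    """ins_vel: INS velocity per step; gnss_pos: GNSS fix or None per step.
    q: process noise added while dead reckoning; r: GNSS measurement noise."""
    x, P = 0.0, 1.0              # position estimate and its variance
    track = []
    for v, z in zip(ins_vel, gnss_pos):
        x += v * dt              # predict with INS
        P += q                   # uncertainty grows while dead reckoning
        if z is not None:        # GNSS fix available: Kalman update
            K = P / (P + r)
            x += K * (z - x)
            P *= (1 - K)
        track.append(x)
    return track

# During a GNSS outage (None entries) the estimate runs on INS alone
# and its variance grows; each fix pulls it back toward the satellite
# solution and shrinks the variance.
est = fuse([1.0, 1.0, 1.0, 1.0], [1.0, None, None, 4.0])
```

A smoother for offline mapping would additionally run a backward pass over the stored states, which is why it beats real-time filtering in accuracy.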

In mobile mapping, another role of inertial navigation is to coordinate with external sensors such as lasers and cameras. The carrier pose obtained by coupling with GNSS provides high-precision, high-frequency positioning for the image poses and the laser pulse emission poses. After extrinsic calibration between the sensors, the corresponding measurements are projected into a global three-dimensional coordinate system.
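The projection just described is a chain of two rigid transforms: sensor frame to body frame via the extrinsic calibration, then body frame to global frame via the GNSS/INS pose. The rotations and translations below are illustrative placeholders:

```python
# Projecting a point measured in a sensor (camera/laser) frame into a
# global frame: p_global = R_wb @ (R_bs @ p_sensor + t_bs) + t_wb.

import numpy as np

def to_global(p_sensor, R_bs, t_bs, R_wb, t_wb):
    """R_bs, t_bs: sensor-to-body extrinsics; R_wb, t_wb: body pose
    in the global frame from the GNSS/INS solution."""
    p_body = R_bs @ p_sensor + t_bs   # extrinsic calibration
    return R_wb @ p_body + t_wb       # carrier pose

# Identity extrinsics, body translated 10 m along the global x axis:
p = to_global(np.array([1.0, 0.0, 2.0]),
              np.eye(3), np.zeros(3),
              np.eye(3), np.array([10.0, 0.0, 0.0]))
# p = [11, 0, 2]
```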

Another combination is visual-inertial odometry (VIO), in which inertial navigation is coupled with a visual sensor. A visual sensor yields good SLAM results in scenes with rich texture, but fails when moving objects dominate the image or when features are scarce.

Integrating inertial data improves overall positioning accuracy and continuity. MEMS inertial units are widely used in smartphones, and both Apple's ARKit and Google's ARCore frameworks provide VIO implementations to support augmented reality applications.

Multi-sensor integrated positioning and navigation has become a trend. An integrated navigation system that first combines INS with GNSS and then adds image and lidar sensors is the research hotspot and development direction in autonomous driving and high-precision map making.

Visual-inertial navigation framework and key technologies

The mainstream visual-inertial fusion framework is divided into two parts: a front end and a back end. The front end extracts information from the sensor data and builds a model for state estimation. The back end optimizes over the data provided by the front end and finally outputs the camera's position and attitude along with a global map. The architecture is shown in the figure below:
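The front-end/back-end split can be sketched schematically. This is not a real SLAM system: the one-dimensional "constraints" and chaining "optimization" below are illustrative stand-ins for feature tracking plus IMU pre-integration on the front end and graph optimization (e.g. bundle adjustment) on the back end:

```python
# Schematic of the front-end / back-end division of labor in a
# visual-inertial fusion pipeline (toy 1-D stand-in).

def front_end(frames):
    """Turn consecutive 'frames' into relative-motion constraints,
    as feature tracking + IMU pre-integration would."""
    return [(i, i + 1, frames[i + 1] - frames[i])
            for i in range(len(frames) - 1)]

def back_end(constraints, x0=0.0):
    """Combine relative constraints into absolute poses -- a stand-in
    for the optimization that outputs poses and the global map."""
    poses = [x0]
    for _, _, delta in constraints:
        poses.append(poses[-1] + delta)
    return poses

poses = back_end(front_end([0.0, 1.1, 1.9, 3.2]))
# The recovered trajectory matches the input up to floating-point error.
```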

AutoNavi's high-precision map production scheme

High-precision map production mainly covers two types of elements: road signs, such as guidance signs and traffic lights, and ground markings, such as lane lines and guide arrows. For both types of map elements, the position is computed first, and then the elements are associated with the road network to obtain their attribute and geometric information.
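The second step, associating a positioned element with the road network, can be sketched as a nearest-link lookup. The link IDs, coordinates and the simple nearest-point criterion are illustrative assumptions, not AutoNavi's actual matching logic:

```python
# Toy association of a positioned map element (e.g. a guide arrow)
# with the nearest road-network link so its attributes can be attached.

import math

def nearest_link(element_xy, links):
    """links: mapping of link id -> (x, y) reference point of the link."""
    ex, ey = element_xy
    return min(links, key=lambda k: math.hypot(links[k][0] - ex,
                                               links[k][1] - ey))

links = {"link_a": (0.0, 0.0), "link_b": (50.0, 0.0)}
nearest_link((48.0, 3.0), links)   # -> "link_b"
```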

Map element production combines manual operation with automatic extraction. First, from the field-collected images and the computed trajectory, the information needed for visual-inertial navigation is obtained automatically, and map elements are generated with visual-inertial fusion technology. On top of this automatically generated map, operators use a Web-based editing tool to refine the accuracy of the map elements, and the results are stored in the corresponding database.

Looking forward

Many high-precision map production schemes based on visual-inertial navigation exist; companies at home and abroad, such as Momenta, Kuandeng Technology, and lvl5, are all working on them. Judging from the current market, however, equipment cost constraints put the precision limit of vision-based high-precision maps at about 10 cm.

In the future, vision-based high-precision maps may develop toward multi-source data fusion: multiple passes over the same road and data collected by different devices are fused together to improve accuracy and the timeliness of map updates.

AutoNavi is rooted in the map industry, with rich map data sources, industry-leading automated production technology and a mature process flow, laying a solid foundation for future high-precision map production based on fusing multiple visual-inertial data sources, which will further promote the development of autonomous driving.