I. Background and current situation

In recent years, domestic road infrastructure and related facilities have been changing rapidly. Users' strong demand for daily travel places higher requirements on the quality and currency of the electronic map products they use. The traditional map production process, in which data collected in the field by collection equipment is then processed manually, suffers from increasingly prominent problems such as slow data updates and high processing costs.

Drawing on its strengths in visual AI and big data, Amap is leading the transformation of the map data industry. It uses image AI to identify and extract all kinds of data elements directly from collected material, laying a solid technical foundation for replacing manual operation with machines.

Based on high-frequency, high-density collection of real-world data, Amap uses image AI to automatically detect and recognize the location and content of all kinds of traffic signs and road markings in its massive library of collected material. By comparing the results with historical information, it can quickly find changes in the real world. Combined with strong, professional data fusion capabilities, this achieves 100% information integration and thereby builds a highly up-to-date national base map.

In summary, through close cooperation among the algorithm, map and engineering teams and the data collection and data production businesses, Amap has built a fully automated map data production line whose core technologies are image recognition, location services, difference filtering and data fusion. The line establishes an efficient, high-quality data channel from the real world to the map application on the user's terminal.

II. Feasibility and focus of the automated production line

Image object classification and detection have a history of several decades, over which a series of classical algorithms emerged. In recent years, with the rapid development of image recognition technology, especially deep learning, and the growth of GPU computing power, classification and detection have improved greatly.

As for the big data that automation requires, AutoNavi has been producing map data for more than ten years and has accumulated rich, accurate data covering the whole country. It also collects a large amount of material every day, which forms a natural sample pool for algorithm training. In addition, a set of professional, standardized map production specifications provides a solid theoretical foundation for data fusion.

Therefore, in terms of algorithm readiness and accumulated data and processes, building an automated production line is highly feasible. It focuses on the following four parts:

Image recognition: the goal is to parse real-world information relevant to map data from the input images. Traffic signs and road markings are detected and classified into fine-grained types, and the numbers and words on them are read and expressed as text. Because the input is a continuous image sequence, a single sign or marking can appear in several images, so the same information across images is merged and the most suitable image is selected as the main image for display.

Location service: based on low-precision GPS and the captured images, the location service calculates the exact position of the collection vehicle and of objects in the scene and maps them onto the map data. Its core abilities include understanding roads in images, analyzing the positions of signs and markings, and matching collection tracks. Using track characteristics and road connectivity, a probabilistic matching model is built that relates the positioned location, angle and speed to candidate roads, and the track is associated with the map data. By understanding the scene across many pictures, the position of each picture relative to the intersection is obtained, and the action position of the object is then determined by combining it with the shape of the road data on the map.

Image difference and semantic filtering: the purpose is to compare newly collected data with the data in the existing database for consistency, automatically filtering out information that is unchanged and keeping information that has changed. The two differ in what they compare. Image difference detects whether newly acquired images at a location have changed relative to historical images, comparing both the trajectory and the images themselves. Semantic filtering looks at the recognized content from the perspective of data and checks, in terms of map semantics, whether it differs from the data in the master database.

Location-based data fusion: the results of image recognition are combined with the position provided by the location service to obtain the path on which an object acts. Using an abstract intersection model, data fusion is then carried out on the road or intersection, that is, map data is added or updated.
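The semantic filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the production logic: the `Element` record, its fields and the `(road_id, kind)` key are hypothetical, and the sketch assumes at most one element of each kind per road.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Element:
    """Hypothetical map element produced by recognition or stored in the database."""
    road_id: str   # road the element acts on (assumed identifier)
    kind: str      # element type, e.g. "speed_limit"
    content: str   # recognized text, e.g. "60"


def semantic_filter(
    recognized: List[Element], database: List[Element]
) -> Tuple[List[Element], List[Tuple[Element, Element]], List[Element]]:
    """Split recognized elements into added, changed and unchanged.

    Elements are keyed by (road_id, kind); this is a simplifying assumption.
    """
    existing: Dict[Tuple[str, str], Element] = {
        (e.road_id, e.kind): e for e in database
    }
    added, changed, unchanged = [], [], []
    for element in recognized:
        old = existing.get((element.road_id, element.kind))
        if old is None:
            added.append(element)              # not in the database yet
        elif old.content != element.content:
            changed.append((old, element))     # same slot, different content
        else:
            unchanged.append(element)          # filtered out, nothing to update
    return added, changed, unchanged
```

Only the `added` and `changed` lists would be passed on to data fusion; everything in `unchanged` is dropped.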

III. Key technical capabilities

1. Image recognition

Image recognition faces three main challenges. The first is the variety of scenes and types: the objects to be detected are diverse, including traffic signs, ground guidance lines, traffic cameras and so on. A typical example is the ordinary direction information sign.

To sum up, as far as the algorithm itself is concerned, traffic sign detection is a multi-class object detection task. The mainstream approach is an end-to-end scheme based on deep learning, which performs detection and sub-classification simultaneously in a single network. Commonly used datasets include PASCAL VOC (20 classes) and COCO (80 object classes).

To meet the actual needs of the business, the scheme is divided into two stages: object detection and fine classification. In the detection stage, all traffic signs in the picture are detected with Faster R-CNN. This stage requires a very high recall rate and execution speed, and its precision requirement can be relaxed accordingly. In the fine classification stage, the candidate boxes produced by the detection stage are finely classified and noise is filtered out, so that the final result has both a very high recall rate and a very high precision.
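The two-stage idea can be sketched as below. This is only an illustration under assumptions: the detector output is given as a list of candidates rather than produced by a real Faster R-CNN, the `classify` callable stands in for the fine classification network, and both threshold values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


@dataclass
class Candidate:
    box: Box
    score: float  # detector confidence for "some traffic sign"


def two_stage(
    candidates: List[Candidate],
    classify: Callable[[Box], Tuple[str, float]],
    detect_threshold: float = 0.2,   # assumed: low, to favor recall
    class_threshold: float = 0.8,    # assumed: high, to restore precision
) -> List[Tuple[Box, str, float]]:
    """Keep candidates loosely in stage 1, then filter strictly in stage 2."""
    results = []
    for candidate in candidates:
        # Stage 1: a permissive score threshold keeps almost every real sign.
        if candidate.score < detect_threshold:
            continue
        # Stage 2: the fine classifier assigns a subtype or rejects noise.
        label, confidence = classify(candidate.box)
        if label != "background" and confidence >= class_threshold:
            results.append((candidate.box, label, confidence))
    return results
```

The design choice is that errors of the first stage are cheap (an extra crop to classify), while a missed sign cannot be recovered later, which is why the first threshold is kept low.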

2. Location service

Track drift has always been a major challenge for accurately matching positions to the map. On the one hand, parallel and elevated roads, especially main and auxiliary roads separated by only one or two lanes, demand high positioning accuracy. Conventional GPS is accurate to 5-10 m, which makes it difficult to reach even an 80% recognition rate for main versus auxiliary roads. On the other hand, the base map data itself also has GPS accuracy problems.

Beyond the basic theory, such as rules, hidden Markov model learning and inference, and the Viterbi algorithm, resisting positioning drift in a reasonable way is the key to successful trajectory matching. By studying and summarizing trajectory shapes, one can find their regularities, build a probability model that fits their characteristics, express the matching process accurately, and balance matching accuracy against resistance to drift. In addition, the connectivity of long tracks, together with the lane count or relative road position obtained from image recognition, can resolve parallel-road scenes.
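A toy version of hidden Markov model matching with the Viterbi algorithm is sketched below. It is a simplified illustration, not the model described in the text: the emission term is an assumed Gaussian on point-to-road distance only (angle and speed are left out), the transition term only distinguishes "same road", "connected road" and "not connected", and the parameters `sigma` and `switch_cost` are hypothetical.

```python
import math
from typing import Dict, List, Set, Tuple


def match_track(
    distances: List[Dict[str, float]],   # distances[t][road] in meters
    connected: Set[Tuple[str, str]],     # directed pairs (from_road, to_road)
    sigma: float = 10.0,                 # assumed GPS noise scale in meters
    switch_cost: float = 3.0,            # assumed log-penalty for changing road
) -> List[str]:
    """Return the most likely road for each GPS point (Viterbi decoding)."""
    roads = list(distances[0].keys())

    def emission(dist_m: float) -> float:
        # Log of an unnormalized Gaussian: closer roads score higher.
        return -0.5 * (dist_m / sigma) ** 2

    def transition(a: str, b: str) -> float:
        if a == b:
            return 0.0
        # Connectivity rules out jumps between roads that do not join.
        return -switch_cost if (a, b) in connected else -math.inf

    score = {r: emission(distances[0][r]) for r in roads}
    back: List[Dict[str, str]] = []
    for obs in distances[1:]:
        new_score, pointers = {}, {}
        for r in roads:
            prev = max(roads, key=lambda p: score[p] + transition(p, r))
            new_score[r] = score[prev] + transition(prev, r) + emission(obs[r])
            pointers[r] = prev
        score = new_score
        back.append(pointers)

    # Backtrack from the best final state.
    path = [max(roads, key=lambda r: score[r])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

With a main road and a parallel auxiliary road that are not connected on the observed stretch, a single drifting point that lies closer to the auxiliary road no longer flips the match, because the whole track is scored jointly rather than point by point.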

Determining the road on which an object acts and its action position depends on combining the intersection position obtained from image recognition with an understanding of the map data scene. For example, the position of a sign relative to a road or intersection is very hard to determine from recognition alone; it also requires understanding the road network characteristics of the map data. The judgment is complex: a person grasps it at a glance, but it is hard to describe for a machine in terms of rules. Therefore, the scene is analyzed as a straight road section, a straight-through intersection or a turn, and compared with the map's road section or intersection model to determine the road on which the object acts. The action position is then calculated according to the object's attributes.

3. Image difference and semantic filtering

The main problem in image difference is data alignment: multiple collections at the same location are affected by the limited accuracy of GPS itself and by wrong road judgments caused by drift when satellite signals are occluded. In addition, for semantic recognition, factors in the natural environment such as occlusion, blur, shadow, rain and snow, and changes in viewing angle hinder subsequent algorithms from parsing the deeper semantic information of the image (type, content and so on). Together, these two factors make consistency comparison across multiple images and their semantics more difficult.

In response, the algorithms have substantially and quickly improved the accuracy of recognition and consistency judgment, so that mismatches do not affect data updates. Image difference is divided into two parts: data alignment and local matching. Data alignment answers whether two collected images were taken at the same position and viewing angle; it judges the positional relationship of the two images by rough screening on the GPS track followed by image matching. Local matching answers whether two objects are of the same type; for objects with text content, it must also check that layout and text are consistent. Therefore, in addition to common point-feature matching, an image matching network based on deep learning is used. For text content, OCR is used to read and analyze the content, and the final judgment is whether the content collected on the two occasions is fully consistent.
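The rough GPS screening and the final content comparison can be sketched as follows. This is a minimal illustration under assumptions: each shot is a hypothetical dictionary with `lat`, `lon` and `heading` fields, the distance and heading thresholds are made up, and the image matching network between the two steps is not shown.

```python
import math
from typing import Dict


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two WGS84 points."""
    radius = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def heading_diff(a_deg: float, b_deg: float) -> float:
    """Smallest absolute angle between two headings, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360
    return min(d, 360 - d)


def roughly_aligned(
    shot_a: Dict[str, float],
    shot_b: Dict[str, float],
    max_dist_m: float = 30.0,        # assumed screening radius
    max_heading_deg: float = 30.0,   # assumed viewing-angle tolerance
) -> bool:
    """Rough screening: were the two shots taken near each other, facing alike?"""
    dist = haversine_m(shot_a["lat"], shot_a["lon"], shot_b["lat"], shot_b["lon"])
    turn = heading_diff(shot_a["heading"], shot_b["heading"])
    return dist <= max_dist_m and turn <= max_heading_deg


def same_content(kind_a: str, text_a: str, kind_b: str, text_b: str) -> bool:
    """Consistency check on type and OCR text, ignoring case and whitespace."""
    def normalize(text: str) -> str:
        return "".join(text.split()).upper()
    return kind_a == kind_b and normalize(text_a) == normalize(text_b)
```

Pairs that fail `roughly_aligned` never reach the more expensive image matching, which is the point of screening on the GPS track first.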

4. Location-based data fusion

Because the real world is complex, years of map production experience have produced a large body of standardized map data production specifications. These are intangible assets that abstract the real world reasonably and express it accurately. However unusual a real road network may be, the model can abstract and classify it into a relatively general map data model for each scene. A large set of map data processing tools and methods is built on top of that model, which allows automatic data fusion to be applied widely.

IV. Summary

In essence, automating Amap's SD base map production means introducing image AI and data fusion technology into the base map production process and combining them with years of digital map production specifications and experience to create an automated data production line. Automation frees up manual labor and delivers efficient, high-quality map data continuously. This addresses the high specialization, high labor cost and low efficiency of map suppliers' production lines, and ultimately meets users' demand for up-to-date electronic map data when they travel.