The following content is reprinted from the Juejin article “WebAR technology exploration – application in navigation” by Duoduo Loai learning (juejin.cn/post/684490…). The copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please credit the source.


This article explores the key technologies and difficulties of implementing an AR navigation effect in the Web front end.

1. Introduction to AR

Augmented Reality (AR) is a technology that calculates the position and angle of the camera image in real time and overlays images, video, and 3D models on it. The goal is to superimpose the virtual world on the screen and let it interact with the real world.

In general, the main steps to achieve AR effect on the Web are as follows:

  1. Obtain the video source
  2. Identify the marker
  3. Overlay virtual objects
  4. Display the final frame

What is special about AR navigation is that it does not identify a marker to decide where to overlay virtual objects; instead, it connects the virtual and the real through positioning. The main steps are as follows:

  1. Obtain the video source
  2. Transform coordinate systems:
    1. Get the absolute positions of the device and the path
    2. Calculate the relative position between each marker point on the path and the device
    3. Draw the marker points in the device coordinate system
  3. Superimpose the 3D image on the video
  4. Update the position and device orientation to control the camera movement in three.js
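The coordinate-system transformation in step 2 can be sketched as follows. This is an illustrative approximation, not code from the article: over the short distances involved in walking navigation, a simple equirectangular approximation is enough to turn a marker's latitude/longitude into metre offsets relative to the device (the function and field names here are hypothetical).

```javascript
var EARTH_RADIUS = 6371000; // mean Earth radius in metres

// Convert a marker's lat/lng into metre offsets relative to the device.
// x is the east-west offset; z is the north-south offset, with north
// mapped to -z as is conventional for a three.js scene.
function toLocalMeters(device, marker) {
	var rad = Math.PI / 180;
	var dLat = (marker.lat - device.lat) * rad;
	var dLng = (marker.lng - device.lng) * rad;
	return {
		x: dLng * Math.cos(device.lat * rad) * EARTH_RADIUS,
		z: -dLat * EARTH_RADIUS
	};
}
```

The resulting offsets can then be scaled into WebGL units and used as sphere coordinates in the scene.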

2. Technical difficulties

Implementing the steps above involves the following main technical difficulties:

  1. Compatibility issues
  2. WebGL 3D drawing
  3. Positioning accuracy and trajectory optimization
  4. Mapping between virtual and real unit lengths

2.1 Compatibility Problems

Compatibility problems across operating systems and browsers mainly show up in support for obtaining the video stream and gyroscope information on the device.

2.1.1 Obtaining the Video stream

  1. Navigator API compatibility handling

    navigator.getUserMedia() is deprecated; the new standard is navigator.mediaDevices.getUserMedia(). However, browsers support the new method to different degrees, which requires detection and fallback handling. Also, the legacy method has vendor-prefixed names in different browsers, such as webkitGetUserMedia.

    // The mediaDevices property is not supported
    if (navigator.mediaDevices === undefined) {
    	navigator.mediaDevices = {};
    }
    // mediaDevices.getUserMedia is not supported
    if (navigator.mediaDevices.getUserMedia === undefined) {
    	navigator.mediaDevices.getUserMedia = function(constraints) {
    		var getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia;
    
    		if (!getUserMedia) {
    			return Promise.reject(new Error('getUserMedia is not implemented in this browser'));
    		}
    
    		return new Promise(function(resolve, reject) {
    			getUserMedia.call(navigator, constraints, resolve, reject);
    		});
    	};
    }
  2. Parameter compatibility processing

    getUserMedia receives a parameter of type MediaStreamConstraints, which contains two members: video and audio.

    var constraints = {
    	audio: true,
    	video: {
    		width: { 
    			min: 1024,
    			ideal: 1280,
    			max: 1920
    		},
    		height: 720,
    		frameRate: {
    			ideal: 10,
    			max: 15
    		},
    		facingMode: "user" // user/environment selects the front or rear camera
    	}
    };

    The facingMode parameter is only partially supported, mainly in Firefox and Chrome. Other browsers (WeChat, Mobile QQ, QQ Browser) require a different parameter, optional.sourceId, which takes the ID of the device's media source. In testing, this approach behaved differently across WeChat and Mobile QQ versions on different devices.

    if (MediaStreamTrack.getSources) {
    	MediaStreamTrack.getSources(function (sourceInfos) {
    		for (var i = 0; i !== sourceInfos.length; ++i) {
    			var sourceInfo = sourceInfos[i];
    			// Both audio and video sources are listed here, so distinguish them
    			if (sourceInfo.kind === 'video') {
    				exArray.push(sourceInfo.id);
    			}
    		}
    		constraints = {
    			video: {
    				optional: [{
    					sourceId: exArray[1] // 0 is the front camera, 1 is the rear camera
    				}]
    			}
    		};
    	});
    } else {
    	constraints = {
    		video: {
    			facingMode: {
    				exact: 'environment'
    			}
    		}
    	};
    }
  3. Operating system incompatibility

    Because of Apple's security restrictions, getUserMedia() is not supported by any browser on iOS devices. Therefore, WebAR navigation cannot be implemented on iOS.

  4. Protocol restrictions

    For security reasons, since Chrome 47 the video source can only be obtained on pages served over HTTPS.
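Putting this section together, a minimal usage sketch of the promise-based API (the helper name and flow are illustrative, assuming the compatibility shim above and an HTTPS page):

```javascript
// Hypothetical helper: start the camera preview on a given <video> element,
// preferring the rear camera for AR use.
function startCamera(videoEl) {
	return navigator.mediaDevices.getUserMedia({
		video: { facingMode: 'environment' },
		audio: false
	}).then(function(stream) {
		// Attach the camera stream to the video element and start playback
		videoEl.srcObject = stream;
		return videoEl.play();
	});
}
```

Callers handle rejection (permission denied, no camera, insecure page) in a single `.catch`, regardless of which underlying implementation the shim selected.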

2.1.2 Obtaining the Device Rotation Angle

The device's rotation angle represents the user's viewing direction and is a key parameter connecting the virtual and real worlds. HTML5 provides the DeviceOrientation API to capture the device's rotation angles in real time: listening for the deviceorientation event yields a DeviceOrientationEvent object.

{
	absolute: [boolean], // whether the rotation values are absolute
	alpha: [0, 360],
	beta: [-180, 180],
	gamma: [-90, 90]
}

alpha, beta, and gamma are the angles we want to obtain. For their respective meanings, see the figure below and the reference article:

The basics of gyroscopes

However, in iOS WebKit browsers this object also includes a webkitCompassHeading member, whose value is the device's deviation angle from due north. Also, in iOS browsers alpha is not an absolute angle: the angle at which you start listening to the event is taken as zero.

On Android, we can use -alpha to get the device's angle from due north, but in our tests so far this value has been unstable. So in the test demo we added a manual correction step: before navigation starts, the user points the device due north to establish an absolute 0 degrees. This is not rigorous, but it works well.
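The manual correction just described amounts to recording an offset at calibration time and subtracting it from every subsequent reading. A minimal sketch (function names are illustrative, not from the article's code):

```javascript
var alphaOffset = 0;

// Called when the user points the device due north and confirms:
// the current raw alpha becomes the zero reference.
function calibrate(currentAlpha) {
	alphaOffset = currentAlpha;
}

// Applied to every later deviceorientation reading; the double modulo
// normalises the result into [0, 360).
function correctedAlpha(rawAlpha) {
	return ((rawAlpha - alphaOffset) % 360 + 360) % 360;
}
```

In the browser this would be wired up as `window.addEventListener('deviceorientation', function(e) { heading = correctedAlpha(e.alpha); })`.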

2.2 WebGL 3D Drawing

WebGL is a specification for rendering 3D graphics in the browser. AR navigation needs to draw markers at different distances and angles, so 3D rendering is required to match the real-scene video stream. However, the native WebGL interface is very complex. Three.js is a library built on WebGL that wraps the native methods into a simpler API, making programming much more convenient.

There are three main concepts in three.js:

  1. Scene: the container for objects; to draw a marker point, we add a sphere with the specified coordinates and size to the scene
  2. Camera: simulates the human eye and determines which angle and part of the scene is presented; in AR navigation we mainly simulate the movement and rotation of the device through the movement and rotation of the camera
  3. Renderer: binds a canvas and renders the scene captured by the camera onto the web page

In the AR navigation code, I encapsulate the creation of these three.js pieces: pass in a DOM element (usually a `<div>`, as the container) and parameters, and the three components are created automatically. Interface methods such as addObject and renderThree are provided to add/remove objects in the scene and to update the rendering.
function Three(cSelector, options) {
	var container = document.querySelector(cSelector);
	// Create the scene
	var scene = new THREE.Scene();
	// Create the camera
	var camera = new THREE.PerspectiveCamera(options.camera.fov, options.camera.aspect, options.camera.near, options.camera.far);
	// Create the renderer with a transparent background so the video shows through
	var renderer = new THREE.WebGLRenderer({ alpha: true });
	// Camera rotation controller
	var oriControls = new THREE.DeviceOrientationControls(camera);
	// Set the canvas size and add it to the page
	renderer.setSize(container.clientWidth, container.clientHeight);
	renderer.setClearColor(0xFFFFFF, 0.0);
	container.appendChild(renderer.domElement);

	this.main = {
		scene: scene,
		camera: camera,
		renderer: renderer,
		oriControls: oriControls
	};
	this.objects = [];
	this.options = options;
}
// Add an object to the scene; type supports sphere/cube/cone
Three.prototype.addObject = function(type, options) { ... };
// Remove an object from the scene
Three.prototype.popObject = function() { ... };
// Set the camera position
Three.prototype.setCameraPos = function(position) { ... };
// Update the rendering; render is a callback
Three.prototype.renderThree = function(render) { ... };
// Set the offset angle used to correct alpha
Three.prototype.setAlphaOffset = function(offset) { ... };

To control camera rotation I use DeviceOrientationControls, the official three.js controller for rotating the camera with the device. It listens for deviceorientation events, converts the DeviceOrientationEvent angles into Euler angles, and applies them to the camera. Just call its update method on each rendered frame:

three.renderThree(function(objects, main) {
	animate();
	function animate() {
		window.requestAnimationFrame(animate);
		main.oriControls.update();
		main.renderer.render(main.scene, main.camera);
	}
});

2.3 Positioning accuracy and trajectory optimization

In our survey, there are currently three ways to obtain positioning: the native navigator.geolocation interface, Tencent's front-end positioning component, and the WeChat JS-SDK geolocation interface:

  1. The native interface

    The navigator.geolocation interface provides the getCurrentPosition and watchPosition methods to get the current position and listen for position changes. In testing, watchPosition updates infrequently on Android; on iOS it updates frequently, but the readings jitter badly.

  2. Front-end positioning component

    To use the front-end positioning component, include the JS module (https://3gimg.qq.com/lightmap/components/geolocation/geolocation.min.js) and construct an object with qq.maps.Geolocation(key, referer). It also provides getLocation and watchPosition methods. In testing, in X5-kernel browsers (including WeChat and Mobile QQ) the positioning component is more accurate than the native interface and updates more frequently.

  3. WeChat JS-SDK geolocation interface

    The WeChat JS-SDK interface can use indoor positioning for higher accuracy, but it requires binding an official account, works only inside WeChat, and provides only a getLocation method, so it was not considered for now.

To sum up, we mainly target X5-kernel browsers, so we chose Tencent's front-end positioning component to obtain positioning. However, testing still exposed positioning problems:

  1. Inaccurate positioning means the virtual objects do not precisely overlap reality
  2. Positioning jitter makes the virtual markers jitter along with it, so movement does not look smooth

To solve these problems, I designed a trajectory optimization method consisting of location denoising, initial-point determination, and snapping to the planned route, achieving more stable and accurate updates while moving.

2.3.1 Location denoising

The location data returned by the getLocation and watchPosition methods contains the following information:

{
	accuracy: 65,
	lat: 39.98333,
	lng: 116.30133,
	...
}

accuracy indicates the positioning accuracy; the lower the value, the more accurate the fix. Assuming the accuracy values on a given device follow a normal distribution (strictly, a positively skewed one), we compute the mean and standard deviation stdev of the accuracy over the whole track and filter out points whose accuracy exceeds mean + (1~2) * stdev. Alternatively, the box-plot method can be used to remove outliers.
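The accuracy-based filter just described can be sketched directly (function and parameter names are illustrative; k is the 1–2 multiplier mentioned above):

```javascript
// Drop track points whose reported accuracy exceeds mean + k * stdev.
function denoise(track, k) {
	var n = track.length;
	var mean = track.reduce(function(s, p) { return s + p.accuracy; }, 0) / n;
	var variance = track.reduce(function(s, p) {
		return s + Math.pow(p.accuracy - mean, 2);
	}, 0) / n;
	var stdev = Math.sqrt(variance);
	return track.filter(function(p) {
		return p.accuracy <= mean + k * stdev;
	});
}
```

With a mostly clean track, a single wildly inaccurate fix sits far above the threshold and is discarded before any further processing.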

2.3.2 Determination of initial point

The initial point is critical: if it deviates, the path is inaccurate, the virtual and real scenes cannot overlap, and the correct movement path cannot be obtained. During testing I found that most fixes obtained right after positioning starts are inaccurate, so some time is spent determining the initial point.

Start positioning and collect fixes for N seconds. Denoise the fixes from those N seconds to form a sequence track_denoise = [loc0, loc1, loc2…]. For each point in the sequence, calculate the sum of its distances to all the other points plus its own accuracy to get a centrality measure, then take the point with the smallest measure as the initial point.
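This centrality measure can be computed as follows (a sketch with illustrative names; dist can be any planar or geodesic distance function):

```javascript
// Pick the point with the smallest score, where score = own accuracy
// plus the sum of distances to every other point in the sequence.
function pickInitialPoint(points, dist) {
	var best = null, bestScore = Infinity;
	points.forEach(function(p) {
		var score = p.accuracy;
		points.forEach(function(q) {
			score += dist(p, q);
		});
		if (score < bestScore) {
			bestScore = score;
			best = p;
		}
	});
	return best;
}
```

The point closest to the cluster's centre wins, and a point with poor reported accuracy is penalised even if it happens to sit centrally.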

2.3.3 Location correction based on route

Based on the assumption that the device always follows the planned route, anchor points can be snapped onto the planned route to prevent the 3D image from jittering.

As shown in the figure below, the projection of the anchor point onto the line segment is taken as the corrected point. The route segment is selected as follows:

  1. Initial state: the segment between the starting point and the second route point is the current segment: `cur = 0; P_cur = P[cur];`
  2. While moving along segment N: if the projection length (the distance from the projected point to the segment's start) is negative, the corrected point is the start of the current segment and the route falls back to the previous segment: `cur = N - 1; P_cur = P[cur];`. If the projection length is greater than the segment's length, the corrected point is the end of the current segment and the route advances to the next segment: `cur = N + 1; P_cur = P[cur];`
  3. If the effective ranges of the current segment and the next segment overlap (the green shaded area in the figure below), compare the anchor point's distance to the two segments and take the shorter one to determine the corrected point and the segment selection.
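The core of the rules above is projecting the anchor point onto a segment and inspecting the projection ratio. A sketch in local planar coordinates (names are illustrative):

```javascript
// Project point p onto segment AB. The unclamped ratio t corresponds to the
// "projection length": t < 0 means fall back to the previous segment,
// t > 1 means advance to the next one; x/y is the snapped (corrected) point.
function snapToSegment(p, a, b) {
	var abx = b.x - a.x, aby = b.y - a.y;
	var apx = p.x - a.x, apy = p.y - a.y;
	var len2 = abx * abx + aby * aby;
	var t = len2 === 0 ? 0 : (apx * abx + apy * aby) / len2;
	var clamped = Math.max(0, Math.min(1, t));
	return {
		t: t,
		x: a.x + clamped * abx,
		y: a.y + clamped * aby
	};
}
```

Rule 3 then compares the distance from the anchor point to its snapped point on each of the two candidate segments and keeps the shorter one.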

2.4 Mapping Between Virtual and Real Unit Lengths

There is no fixed mapping between a unit length in WebGL and a unit length in the real world, so the two cannot be mapped exactly. After testing, we chose a ratio of 1 (m) : 15 (WebGL units).
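Applying that empirically chosen ratio when converting a device-relative offset in metres into scene coordinates might look like this (the offset shape and names are illustrative):

```javascript
var WEBGL_UNITS_PER_METER = 15; // the article's empirically chosen scale

// Scale a {x, z} metre offset into three.js-style scene coordinates;
// markers sit on the ground plane, so y stays 0.
function toSceneCoords(offsetMeters) {
	return {
		x: offsetMeters.x * WEBGL_UNITS_PER_METER,
		y: 0,
		z: offsetMeters.z * WEBGL_UNITS_PER_METER
	};
}
```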

3. Demo Presentation

Demo video: WebAR technology exploration – Application in navigation

Developers interested in maps are welcome to try Tencent Location Service.