I. Background

The mobile team of Manbang Group started using ReactNative at the beginning of 2018. After nearly three years of development, it carries most of our core business scenarios: 16+ business modules and 200+ pages, with tens of millions of page views (PV) per day. Once the core business was built on ReactNative, we were no longer bound by the APP release cycle and moved uniformly to dynamic releases. Dynamic versions ship far more often than APP versions: at least twice a week, and sometimes five times a week.

When we adopted ReactNative in 2018, we used version 0.51, which was relatively new at the time. In subsequent releases, Facebook introduced a number of new features, such as Hooks and the Hermes engine. Because we stayed on version 0.51, these features were unavailable to us, as were many community third-party libraries that depend on newer ReactNative versions. Therefore, after three years on version 0.51, we decided to upgrade to the newer version 0.62.

II. Main improvements of 0.62

2.1 Performance Improvement

The biggest improvement of 0.62 over 0.51 is the use of Hermes as the JS execution engine on Android, which significantly improves startup speed, memory footprint, and JS execution efficiency.

2.2 Stability improvement

Between version 0.51 and version 0.62, a number of functional and stability bugs were fixed, and the robustness of the Native part of the SDK was greatly enhanced. For example, in the ReactHostView on Android, show() and hide() were made safer. Another example is the ViewManager section, where illegal states are now handled directly as exceptions.

2.3 Community Ecology

The ecology of ReactNative is mainly divided into two parts:

2.3.1 React language features.

**0.51** uses React 16.0, while **0.6x** uses 16.11+, which adds many exciting new features. For example, the releases between 16.2.0 and 16.8.0 introduced the new Context API (https://reactjs.org/docs/context.html) and Hooks (https://reactjs.org/docs/hooks-intro.html), which are among the most effective tools for development.

2.3.2 Third-party libraries developed around ReactNative and React

Community third-party libraries tend to raise their React dependency roughly every six months. A well-known example is the navigation library react-navigation, whose newer versions add many useful features, such as the internal routing stack supporting notifications when a page becomes active or moves to the background. These features are very useful in our daily development.

2.4 Performance tracking

The above are the improvements officially provided by Facebook, the most important of which concern performance. To understand these performance gains quantitatively, we ran a comprehensive performance test.

2.5 Performance on Android

Starting from 0.6x, the Hermes engine is available on Android, which brings a significant performance improvement. The biggest advantage of Hermes over JSC is its ability to run precompiled bytecode generated from the JS code directly, resulting in much faster cold starts and a lower memory footprint, at the cost of a larger package size.

In order to understand the performance improvement data, we conducted a performance comparison test of JSC and Hermes on the Android side.

Test device: vivo X21, RAM: 6 GB.

2.5.1 Cold start time data

As you can see from the figure below, the cold start time of Hermes+HBC is more than 50% lower than that of JSC+JS. So we decided to use the Hermes+HBC solution.

2.5.2 Package size data

As can be seen from the figure below, the HBC binary package compresses far less well than the jsbundle, and its size is almost twice that of the jsbundle. However, this can be mitigated later through bundle splitting and on-device conversion to HBC.

2.5.3 Code instruction processing speed

JSC performance deteriorates significantly under heavy computation and parsing, while Hermes remains relatively stable. Hermes takes about 1/6 of the time JSC does. This processing speed greatly improves frame rate and animation smoothness.

2.5.4 Memory usage

Test device: vivo X21, RAM: 6 GB.

Two conclusions can be drawn from the data measured in the figure below:

1. The memory performance of ReactNative 0.62 is significantly better than that of ReactNative 0.51, thanks to the loading mechanism of Hermes, which does not load the entire file into memory for parsing at once.

2. Memory usage in ReactNative 0.62 fluctuates less, because Hermes executes binary bytecode rather than JS source and needs no secondary transcoding.

Across the whole test run, which involved 4 ReactInstanceManager instances and 5 pages, 0.62 saved 56 MB of memory, a considerable benefit.

2.6 Performance on iOS

After upgrading from 0.51 to 0.62, JSC is still the only JS engine on iOS. In addition to the plain jsbundle, however, the RAM bundle format is supported, and using RAM bundles together with inline requires can greatly improve cold start speed and memory usage. Because we plan to split out a base bundle later, we did not adopt the RAM format and still use the JSC+jsbundle scheme on iOS. As a result, there is not much improvement in memory, cold start, or instruction execution speed on iOS. The latest ReactNative version, 0.64, does officially support Hermes on iOS.

In terms of performance data, Android improves greatly, and the latest React features, such as Hooks, become available. We therefore decided to upgrade to version 0.62.

III. How to perform a seamless upgrade

3.1 Challenges and risks

3.1.1 Cross-department coordination

As mentioned earlier, ReactNative hosts most of Manbang's core business scenarios, involving 16+ business modules, 200+ pages, and 50+ developers. Manbang Group's business is developing rapidly, and all kinds of operational campaigns run on a daily basis. The business is large, the staff is large, the iteration pace is fast, and the stability requirements are high, so the work of multiple testing, development, and release teams has to be coordinated.

3.1.2 SDK upgrade in parallel with high-frequency releases

To accommodate fast-paced business iterations, we publish dynamic releases at least twice a week (up to five times a week). We require that technical changes never affect business iteration (including both APP version iteration and dynamic version iteration), and that no business requirement is delayed because of technical changes. Therefore, the 0.51 release work and the 0.62 upgrade work need to proceed in parallel without interfering with each other.

3.1.3 Reducing upgrade Costs

With such a fast pace and high release frequency, the SDK upgrade should not add much burden to the development and testing of business requirements; its impact on both should be kept as small as possible. As a major version upgrade spanning 3 years, it involves a large number of release notes. We need to absorb as many of these differences as possible at the bottom layer, so as to minimize the amount of code developers must modify and the regression effort required of testers, and reduce costs in all respects.

3.1.4 Ensuring online stability

The two core apps of Manbang Group have an average daily UV of about 5 million, and the requirements for APP experience are very strict. If the abnormal rate rises by 1/10,000, the customer complaint rate rises with it. Guaranteeing stability is the top priority of the upgrade plan. But no matter how perfect our plan is, there is no guarantee that nothing unexpected will happen. Therefore, we need to detect online anomalies as early as possible, limit their impact, and repair them in time.

Upgrading the ReactNative SDK in this situation is like changing a tire on a heavy truck traveling at 120 km/h.

3.2 Upgrade Scheme Principles

3.2.1 Low risk

There are two main points:

1. Low business risk: the upgrade does not affect the iteration of business requirements.

2. Low stability risk: the upgrade does not affect online stability, and the abnormal rate is kept at a very low level.

3.2.1.1 Release scheme design

In order to meet the above two conditions, we decided to go online in batches, using grayscale release.

Batching means dividing online users into batches, with one batch going online first and the others following. Manbang has four apps: the Yunmanman driver app, the Huochebang driver app, the Yunmanman consignor app, and the Huochebang consignor app. After analyzing the business characteristics, we chose to upgrade the two driver apps in the first batch and the two consignor apps in the second batch.

Grayscale release is now common practice in the industry, so we will not explain the concept here. The details of our grayscale scheme are described below.

3.2.1.2 Alarm and Rollback Scheme Design

To be truly low risk, we also need to nip online problems in the bud, which requires an alarm mechanism. The ReactNative stack already had a complete alarm mechanism before the upgrade, so we only needed to split 0.62 out into its own statistical dimension and calculate it separately. This matters because the early grayscale volume is small: if 0.62 shared the original alarm statistics, its problems would rarely trigger the alarm condition.

For online problems that cannot be solved within a short time, we also need a downgrade plan that can switch online users from 0.62 back to 0.51 quickly, and then switch back to 0.62 after the problem is solved.

3.2.2 Low cost

Low cost here means keeping the impact on business development and testing as low as possible. We reduce the amount and difficulty of code modification to save development effort, and we narrow the scope of impact to reduce the range and intensity of regression testing and save testing effort.
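The effect of a separate statistical dimension can be illustrated with a minimal TypeScript sketch. The sample shape, the `sdkVersion` field, the traffic numbers, and the 0.1% threshold are all assumptions made up for illustration; they are not our actual monitoring code or data.

```typescript
// Hypothetical shape of one aggregated monitoring sample.
interface Sample {
  sdkVersion: "0.51" | "0.62";
  pageViews: number;
  jsErrors: number;
}

// Abnormal rate = errors / page views, guarded against empty input.
function abnormalRate(samples: Sample[]): number {
  const views = samples.reduce((sum, s) => sum + s.pageViews, 0);
  const errors = samples.reduce((sum, s) => sum + s.jsErrors, 0);
  return views === 0 ? 0 : errors / views;
}

// Compute the rate per SDK version and return the versions above the threshold.
function versionsOverThreshold(samples: Sample[], threshold: number): string[] {
  const versions = Array.from(new Set(samples.map((s) => s.sdkVersion)));
  return versions.filter(
    (v) => abnormalRate(samples.filter((s) => s.sdkVersion === v)) > threshold
  );
}

// Made-up numbers: 0.62 serves 1% of traffic but has a much higher error rate.
const samples: Sample[] = [
  { sdkVersion: "0.51", pageViews: 990000, jsErrors: 99 },
  { sdkVersion: "0.62", pageViews: 10000, jsErrors: 50 },
];
const threshold = 0.001; // assumed alarm threshold of 0.1%

// Pooled together, the 0.62 errors are diluted by the 0.51 traffic.
const pooledAlarm = abnormalRate(samples) > threshold;
// Split by version, the 0.62 dimension is evaluated on its own traffic.
const splitAlarm = versionsOverThreshold(samples, threshold);
```

With these made-up numbers the pooled rate stays below the threshold, while the 0.62 dimension on its own exceeds it, which is the situation the separate dimension is meant to catch.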

3.2.2.1 A set of code

To reduce risk, we go online in multiple batches with grayscale release, so the whole rollout lasts a long time. During the rollout, every business module keeps iterating on new requirements. In other words, both the existing business code and the code for new requirements must be compatible with both versions of the SDK. The simplest solution is to maintain two sets of code, one for each version of the SDK, but this means writing the code twice, which is a heavy burden for business development. To avoid this burden, we came up with a solution in which one set of code accommodates both versions of the SDK.

3.2.2.2 Switching the development environment

One set of code fits both SDKs, and of course the code has to live on one branch. When developing business requirements, developers need to run the code in both SDK environments. We provide an environment switching script that switches to a different ReactNative environment with one command. For example, suppose grayscale has already started for the driver apps online while the consignor apps have not yet started. For code that needs to run in both the driver and consignor apps, developers can use the script to switch between environments during development, as shown below:

3.2.2.3 Code modification scan

To further reduce the cost of adaptation for developers, we have developed a special scripting tool that scans out all the changes that need to be made and shows how to make them.

By adopting the above scheme, we kept the upgrade risk fully under control (stability is managed through multiple batches of grayscale upgrade) and minimized the adaptation cost for developers (one set of code adapted to both SDK versions, plus script-based scanning of required modifications).

IV. Technical preparation

4.1 Reviewing API changes

Before upgrading, we need to go through the API differences between the two SDK versions and thoroughly understand all changes from 0.51 to 0.62. API changes fall into two types:

  1. breaking change
  2. product change

Our method was to read the release notes of every version from 0.51 to 0.62, sort out all breaking changes, and design a dedicated adaptation scheme for each one. For example, AsyncStorage: 0.51 uses XXX and 0.62 uses YYY, so code written for 0.51 and code written for 0.62 are incompatible with each other. Our adaptation solution is to use our own encapsulated Bridge [MBBridge.app.storage].

// Not recommended
// npm install --save @react-native-community/async-storage
// import AsyncStorage from '@react-native-async-storage/async-storage';

// Recommended: use the Bridge
// Get VALUE based on KEY
MBBridge.app.storage.getItem({ key: BootPageModalKey.KEY_IS_SHOW_BOOTPAGEMODAL }).then(res => {
  if (this.isGuidanceSwitch(res?.data?.text)) {
    return null
  }
})

// Store <KEY, VALUE>
MBBridge.app.storage.setItem({ key: Constant.StorageKey.Common.ReferName, text: commonStore.referName })

4.2 Code adaptation scheme

With three or more iterations a week across the whole ReactNative stack, keeping two sets of code (0.51 and 0.62) in sync at such a fast development pace would be too expensive. So we decided to use one set of code that is compatible with both 0.51 and 0.62: for every incompatible API, we encapsulate an adaptation layer that masks the underlying differences. As shown below:

For example, the adaptation of the navigation library is as follows:

Before the change:

This is our business layer

import React, { Component } from "react"
import { StackNavigator } from "native-navigation"
const RootStack = StackNavigator(...)
export default class xxxx extends Component<any, any> {
  render() {
    return (
      <RootStack screenProps={this.props} />
    )
  }
}

After the change:

rn-lib-protocal is our protocol layer

import React, { Component } from "react"
import { createStackNavigatorCompat, createAppContainerCompat } from "@ymm/rn-lib-protocal"
const RootStack = createStackNavigatorCompat(...)
// Create the container once, outside render(), so it is not rebuilt on every render
const App = createAppContainerCompat(RootStack)
export default class StickerPageRouter extends Component<any, any> {
  render() {
    return (
      <App screenProps={this.props} />
    )
  }
}

Here is the protocol implementation layer of 0.62:

import { NavigationActions } from 'react-navigation';

// Thin wrapper so that business code never calls the navigation library directly
export default class StackActionsCompat {

    static reset(resetAction: any) {
        return NavigationActions.reset(resetAction)
    }

    static push(pushAction: any) {
        return NavigationActions.push(pushAction)
    }

    static pop(popAction: any) {
        return NavigationActions.pop(popAction)
    }

    static popToTop() {
        return NavigationActions.popToTop()
    }
}

In this way, business developers can run one set of code on both versions of ReactNative, saving the cost of maintaining two sets of code.

4.3 Script Tool

There are three tools:

1. API check tool (supports local runs and CI/CD);

2. Project environment switching tool;

3. Runtime environment check tool.

4.3.1 API check tool

The API check tool finds APIs that run on 0.51 but are no longer compatible with 0.62. To build it, we abstracted the various changes between the two versions into a set of check rules. The tool is written as a Python script. Developers can run the check locally (by running the Python script directly or through an NPM command) or enable it during Jenkins packaging. The check output looks as follows:

Screenshots of the tool: check in progress, check failed, and check passed.
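The idea of rule-based checking can be sketched as follows. Our actual tool is a Python script; this TypeScript version is only an illustration, and the rule patterns, the `checkSource` function, and the `Finding` shape are assumptions. The two rules are taken from the adaptations described earlier in this article (AsyncStorage and StackNavigator).

```typescript
// One rule: a pattern that flags incompatible usage, plus the suggested fix.
interface Rule {
  id: string;
  pattern: RegExp;
  advice: string;
}

interface Finding {
  ruleId: string;
  line: number; // 1-based line number in the scanned source
  advice: string;
}

// Illustrative rules only; a real rule set would cover every breaking change.
const rules: Rule[] = [
  {
    id: "async-storage",
    pattern: /\bAsyncStorage\b/,
    advice: "Use MBBridge.app.storage instead of AsyncStorage",
  },
  {
    id: "stack-navigator",
    pattern: /\bStackNavigator\s*\(/,
    advice: "Use createStackNavigatorCompat from the protocol layer",
  },
];

// Scan source text line by line and collect every rule hit.
function checkSource(source: string, activeRules: Rule[] = rules): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const rule of activeRules) {
      if (rule.pattern.test(text)) {
        findings.push({ ruleId: rule.id, line: index + 1, advice: rule.advice });
      }
    }
  });
  return findings;
}

// Usage: the first two lines are flagged, the adapted third line passes.
const sample = [
  'import { AsyncStorage } from "react-native"',
  "const RootStack = StackNavigator(routes)",
  "const Safe = createStackNavigatorCompat(routes)",
].join("\n");
const findings = checkSource(sample);
```

Keeping the rules as data makes it easy to add one rule per breaking change, and a CI step can simply fail the build when the list of findings is not empty.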

4.3.2 Environment Switching Tool

The project environment switching tool lets developers easily switch between the 0.51 and 0.62 protocol implementation layers and configuration files (package.json, metro.config.js, etc.). It can be implemented in Shell or Python.

This tool ensures that business developers can work on a single branch without having to care about the API differences and configuration differences between 0.51 and 0.62.

4.3.3 Environment Check Tool

The runtime environment check tool detects mismatches between the native SDK and the artifact it loads, for example the 0.51 native SDK loading a 0.62 Bundle/HBC, or the 0.62 native SDK loading a 0.51 Bundle. Catching these cases early avoids unnecessary troubleshooting and communication costs:
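As a rough illustration of what such a switching tool does, the TypeScript sketch below only computes the list of file copies for a target environment. The `env/<version>/` directory layout and the `planSwitch` function are hypothetical; the article itself only states that the tool swaps the protocol implementation layer and configuration files and can be written in Shell or Python.

```typescript
type RnEnv = "0.51" | "0.62";

interface CopyStep {
  from: string; // version-specific source file
  to: string;   // active file in the project root
}

// Files that differ per environment. package.json and metro.config.js are named
// in the article; storing them under env/<version>/ is an assumption.
const switchedFiles: string[] = ["package.json", "metro.config.js"];

// Build the list of copy operations that activate the target environment.
function planSwitch(target: RnEnv, files: string[] = switchedFiles): CopyStep[] {
  return files.map((file) => ({ from: `env/${target}/${file}`, to: file }));
}

// Usage: print the plan for switching the working tree to 0.62.
for (const step of planSwitch("0.62")) {
  console.log(`copy ${step.from} -> ${step.to}`);
}
```

A real script would then perform each copy, swap in the matching protocol implementation layer, and reinstall dependencies, all behind the single command that developers run.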
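The kind of mismatch this tool looks for can be expressed as a small check. This is a minimal TypeScript sketch under stated assumptions: the `builtFor` metadata field and the `checkEnvironment` function are hypothetical, and the real tool may obtain and report this information differently.

```typescript
type SdkVersion = "0.51" | "0.62";
type ArtifactFormat = "jsbundle" | "hbc";

// Hypothetical metadata attached to a business artifact at build time.
interface Artifact {
  builtFor: SdkVersion;
  format: ArtifactFormat;
}

// Returns null when the pairing is valid, otherwise a human-readable reason.
function checkEnvironment(sdk: SdkVersion, artifact: Artifact): string | null {
  if (artifact.builtFor !== sdk) {
    return `SDK ${sdk} cannot load an artifact built for ${artifact.builtFor}`;
  }
  if (artifact.format === "hbc" && sdk !== "0.62") {
    return "HBC artifacts require the 0.62 SDK with Hermes";
  }
  return null;
}

// Usage: a 0.51 SDK loading a 0.62 HBC package is rejected.
const mismatch = checkEnvironment("0.51", { builtFor: "0.62", format: "hbc" });
const valid = checkEnvironment("0.62", { builtFor: "0.62", format: "hbc" });
```

Returning a reason string instead of a boolean lets the tool show testers exactly which side is wrong, which is where most of the communication cost would otherwise go.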

V. Release scheme

Below is a schematic of our upgrade plan. The process is divided into four main lines based on roles: developer, tester, APP version, and dynamic version. The timeline of each main line lists the detailed actions at key points in time. For example, for business developers (line 1), the compatible business code needs to be merged into the dynamic-1231 main release branch on December 18, 2020; from then on, the same code serves both 0.51 and 0.62 until the end of the upgrade process.

5.1 Upgrade in batches

As mentioned above, we adopt a batch upgrade scheme, with the driver apps launched in the first batch and the consignor apps in the second batch.

On Android, the ReactNative environment had already been turned into a plug-in before this upgrade. To minimize risk, the first batch for the Android driver apps was launched through dynamic plug-in release: the 0.62 SDK and the HBC artifacts were delivered to devices through dynamic upgrade. Dynamic release allows very flexible grayscale pacing: to ensure stability, we can stretch the grayscale period as long as needed. In addition, our dynamic upgrade platform supports real-time online rollback.

For the sake of stability, we decided to launch on Android through dynamic upgrade. However, ensuring stability must not hold up the launch of business requirements. While the 0.62 SDK and HBC artifacts are in grayscale release online, business requirements continue to be released on the 0.51 SDK and jsbundle artifacts. That is, the 0.51 and 0.62 environments need to coexist online for a long time.

One important aspect of the grayscale process is the synchronization of the online environment: 0.51 and 0.62 products are released online at their own pace, without interfering with each other, but at the same time must contain all business requirements.

For example, 0.51 ships a version every two days, while the 0.62 grayscale cycle is 10 days. We must therefore ensure that users get the latest functions regardless of whether they are on 0.62 or 0.51. Our strategy is as follows:

As shown in the figure above, the 0.51 and 0.62 releases run as parallel lines, and the version number of a 0.62 release is designed to always be greater than that of a 0.51 release (ensuring that a 0.62 artifact is never overwritten by a 0.51 artifact). Every time a 0.51 business package is released, a 0.62 business package is released at the same time. Therefore, the following two points are guaranteed:

1. The functions used by online users are always up to date;

2. The 0.62 artifact always follows its own grayscale rhythm and is never overwritten by the 0.51 artifact.

Once 0.62 reaches full rollout, no more 0.51 business packages are released, and the online upgrade switch is complete.

5.2 CI/CD

Because the 0.51 and 0.62 business bundles need to coexist online for a long time, and artifacts built for one environment are incompatible with the other, environment checks in the test phase are not enough. We also insert a series of verification steps into the CI/CD phase:

  1. Switch the environment;
  2. Integrate the Python API compatibility check script into the build process;
  3. Generate the version number according to the artifact type:

     Version number rule for 0.51: 5.91.XXX.YY

     Version number rule for 0.62: 5.91.1XXX.YYYY

  4. Generate additional maps for the Android HBC artifacts and upload them to FTP.
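The version number rule can be illustrated with a short TypeScript sketch. It assumes that the third segment for 0.51 is at most three digits and that the 0.62 rule prefixes it with a 1; `buildVersion` and `compareVersions` are hypothetical helpers written for this illustration, not part of our build scripts.

```typescript
type SdkVersion = "0.51" | "0.62";

// Compare two dotted version strings segment by segment, numerically.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  const len = Math.max(pa.length, pb.length);
  for (let i = 0; i < len; i++) {
    const x = i < pa.length ? pa[i] : 0;
    const y = i < pb.length ? pb[i] : 0;
    if (x !== y) return x < y ? -1 : 1;
  }
  return 0;
}

// Assumed rule: 0.62 artifacts add 1000 to the third segment (the "1XXX" form).
function buildVersion(sdk: SdkVersion, build: number, patch: number): string {
  if (!Number.isInteger(build) || build < 0 || build > 999) {
    throw new RangeError("build must be an integer in 0..999");
  }
  const third = sdk === "0.62" ? 1000 + build : build;
  return `5.91.${third}.${patch}`;
}

// Usage: even the highest 0.51 build sorts below the lowest 0.62 build.
const newest51 = buildVersion("0.51", 999, 9);
const oldest62 = buildVersion("0.62", 0, 0);
const order = compareVersions(newest51, oldest62);
```

Under these assumptions every 0.62 version compares greater than every 0.51 version, so a later 0.51 release can never replace a 0.62 artifact on a device that already has one.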

5.3 Data preparation

This mainly means adding tracking points that distinguish 0.62 data from 0.51 data. We expect the data generated by 0.62 online to accurately reflect the real state of the upgrade (access ratio, stability). We also configured separate alarm policies for the 0.62 data:

| Indicator | Android | iOS |
| --- | --- | --- |
| Container access | .scenario("enterPage") .params("is62", true) | scenario:@"enterPage" extraDic:@{@"is62":@(true),} |
| Abnormal statistics | .scenario("jserror") .params("is62", true) | scenario:@"jserror" extraDic:@{@"is62":@(true),} |
| Engine creation time | .scenario("engine_cost") .params("is62", true) | scenario:@"engine_cost" extraDic:@{@"is62":@(true),} |
| Cold start time | .scenario("cold_lunch") .params("is62", true) | scenario:@"cold_lunch" extraDic:@{@"is62":@(true),} |

VI. Online verification

After the project went online, our job was to follow the online data closely, verify the earlier laboratory data against it, watch the monitoring data, and adjust the plan in time.

6.1 Daily Report Output

During the grayscale period of the 0.62 upgrade package, a daily report was produced covering DAU, PV, number of users with JS exceptions, JS abnormal rate, number of users with SDK exceptions, and SDK abnormal rate for each module, so that developers and testers had an overall picture of the online status. We also prepared a contingency plan in advance: grayscale stops when the abnormal rate reaches a certain threshold.

6.2 Output of Performance Data

Taking the performance data of Android terminal as an example, the performance data collected online is as follows, which is basically consistent with the data measured offline:

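| Stage | Business | Package size | Cold start time | Hot start time | Memory footprint |
| --- | --- | --- | --- | --- | --- |
| Before the upgrade | Delivery page | 5.2 MB | 2115 ms | 48 ms | 30 MB |
| | Audit | 2.4 MB | 1155 ms | 41 ms | 25 MB |
| | ... | ... | ... | ... | ... |
| After the upgrade | Delivery page | 7.1 MB | 748 ms | 49 ms | 21 MB |
| | Audit | 3.8 MB | 552 ms | 43 ms | 16 MB |
| | ... | ... | ... | ... | ... |
| Change | | 45% larger | About 64% shorter | No change | About 30% lower |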
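The summary figures can be cross-checked against the rows shown in the table with a few lines of TypeScript. The `percentChange` helper is only an illustration; the input numbers are the delivery page and audit rows above, and the summary row presumably also reflects the modules that are elided in the table.

```typescript
// Relative change from `before` to `after`, in percent (negative means a reduction).
function percentChange(before: number, after: number): number {
  return ((after - before) / before) * 100;
}

// Round to one decimal place for display.
function round1(value: number): number {
  return Math.round(value * 10) / 10;
}

// Delivery page and audit rows from the table above.
const packageSize = [percentChange(5.2, 7.1), percentChange(2.4, 3.8)].map(round1);
const coldStart = [percentChange(2115, 748), percentChange(1155, 552)].map(round1);
const memory = [percentChange(30, 21), percentChange(25, 16)].map(round1);

// Cold start speed-up factor for the delivery page (old time / new time).
const deliverySpeedup = 2115 / 748;

console.log({ packageSize, coldStart, memory, deliverySpeedup });
```

For the delivery page this gives a cold start reduction of about 64.6% and a speed-up factor of about 2.8, in line with the "about 64%" and "nearly three times" figures quoted in this section; the audit page improves by about 52%.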

1. Package size

On Android, with Hermes+HBC the packaged output changes from a text .jsbundle to a binary .hbc file, which increases the package size by more than 45%. This is a space-for-time optimization (JIT to AOT).

2. Cold start

After adopting the Hermes+HBC solution, instruction execution is much faster: the cold start time is reduced by about 64%, making startup nearly three times as fast, which is basically in line with our earlier tests and expectations.

3. Hot start

We built an engine reuse mechanism: once an engine has been created, it stays resident in memory, so the second startup is a hot start. Unlike a cold start, a hot start does not need time-consuming operations such as loading the JS code and running initialization. As a result, the hot start time barely changed, which is basically in line with our earlier tests and expectations.

4. Memory usage

The Hermes engine executes HBC directly, eliminating JS source interpretation, which reduces runtime memory for a single page after cold start by more than 30%.

6.3 Subsequent Batches

With the first batch of the upgrade basically complete, we had accumulated many best practices: the launch plan, the fault tolerance scheme, the test plan, performance analysis, and so on. The second batch, for the consignor apps, only needed slight adjustments on this basis, and both the launch risk and the overall plan were much smoother.

The first batch on the driver apps verified stability and confirmed that the overall risk was under control. Therefore, the second batch for the consignor apps went online directly with the regular APP release.

There is no plug-in mechanism on iOS, so both batches were launched with the regular APP release, using the default 7-day phased release of the App Store.

VII. Conclusion

With this, the whole upgrade came to an end. Every stage was essential: technical pre-research and performance evaluation at the start of the project, plan formulation, upgrade testing, and final online verification. The process taught our cross-functional team a great deal, not only about technology but also about communication and task management; each of us acted as engineer, manager, risk controller, and product owner at once. Finally, by analyzing online data we kept adjusting the upgrade strategy and its detailed parameters, which was also a way of verifying our laboratory hypotheses. The whole process is not guaranteed to be smooth, but as long as every step has a reasonably sound Plan B and team members work closely together, the upgrade is not as difficult as expected.


Author’s brief introduction

Chao Wang, currently the front-end team architect of Manbang, is responsible for the construction of ReactNative platform of Manbang.