Editor’s note: Electron is arguably the go-to framework for front-end engineers building desktop applications, but it also brings greater complexity and tougher quality and stability challenges. In this talk, Xu Nan, a front-end engineer at Ant Group, walks through the development process and challenges of the Yuque desktop application and explains how its quality and stability are guaranteed through engineering means. Enjoy.

Video of the talk: player.bilibili.com/player.html…

Hello, everyone. My name is Sunan, and I am in charge of research and development for the Yuque desktop application. Today I would like to share some of Yuque's quality practices on the desktop.

Today's talk is divided into three parts: first, what is Yuque; second, why we chose Electron to develop a desktop application; and third, the main content today, the quality and stability work we have done on the Yuque desktop.

What is Yuque

What is Yuque? In short, Yuque is a professional cloud knowledge base. It provides easy-to-use knowledge management and a quiet, pleasant team collaboration experience.

What role does the desktop client play in Yuque? During Yuque's development, we found that many users rely on Yuque for personal knowledge management. We hope to give these users a better personal note-taking tool and a cleaner writing experience through the system capabilities of the desktop client.

Why Electron

Having decided to build a desktop client, we had to make a technology choice: which framework should we use to develop it?

So why did we choose Electron? First, consider the team's technical background: Yuque is a full-stack Web technology team, and our editor and many of our underlying business capabilities are built on JavaScript. Second, we were still a very small, startup-like team at the time, with a strong need to go cross-platform and run one codebase everywhere. Finally, we wanted the chosen framework to have mature products backing it, so that we would not need to get involved in its low-level development.

From that perspective, Electron is a great choice. First, it is based on the Chromium kernel, so we can build our pages with JavaScript. Second, it handles network and file operations through Node.js, which most front-end engineers are familiar with, and it provides a very rich set of system interaction APIs. Finally, many well-known products at home and abroad, including VS Code, are built on Electron.

The overall architecture of the Yuque desktop looks roughly like this: at the bottom, infrastructure built on cloud services; on top of that, a set of cross-platform capabilities; and one level up, the capabilities of the desktop application itself, such as GUI management, publishing tools, R&D support, and so on.

But the main point today is the quality and stability challenges front-end engineers face when developing Electron applications. First, desktop applications have an important characteristic: they are released less frequently than Web services, and since we integrate many Web services, the bar for code quality is higher. Second, with interaction between the renderer process and the main process, plus a lot of system interaction, the links we need to trace when troubleshooting are much longer. Finally, because the package is installed on the user's system, we face more security requirements.

Quality and stability exploration

So how do we solve these problems? First, let's look at Yuque's R&D process. Developers pay most attention to the development, testing, release, and monitoring stages. Unlike many teams, we do not have a full-time testing team, so Yuque's development engineers are the first owners of quality and stability.

What are the main problems at each of these stages? In the development phase, we focus on code quality; in the test phase, on whether integration testing covers the newly developed features well enough; in the release phase, on the package's reach rate and delivery stability; and finally, on the ability to discover problems once we are online.

What engineering tools do we have for these stages? For code quality, unit tests; for test coverage, incremental coverage during testing; for reach rate and stability, dynamic updates and grayscale release capabilities. I will talk about these later with some specific scenarios.

Unit testing

When unit testing comes up, the first thing you might think of is that it is too expensive to write: 20 lines of code can come with 40 lines of tests. The second is that the effect does not last; some engineers find that unit tests bring little benefit for a while and slowly give up. Finally, everyone writes tests in a different style, so maintainability is poor.

So how do we solve these problems? First, how do we minimize the cost of writing a unit test? Our first step is to provide a large number of data constructors for our business models, which can quickly initialize business data.
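As a rough illustration (not Yuque's actual code), such a data constructor might look like the sketch below, with a hypothetical `Doc` model and `createDoc` factory:

```ts
// Hypothetical factory for a "document" business model used in unit tests.
// Real field names will differ; this only illustrates constructing valid
// business data in one call, with sensible defaults and optional overrides.
interface Doc {
  id: string;
  title: string;
  body: string;
  updatedAt: number;
}

let seq = 0;

export function createDoc(overrides: Partial<Doc> = {}): Doc {
  seq += 1;
  return {
    id: `doc-${seq}`,
    title: `Untitled ${seq}`,
    body: '',
    updatedAt: Date.now(),
    ...overrides,
  };
}

// Usage: const doc = createDoc({ title: 'Release notes' });
```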

Second, we know that UI testing depends heavily on user actions, such as simulating a click on a button to trigger a state change. We provide a component rendering initialization capability: you can quickly construct props and render an intermediate state directly, validating UI consistency without complex user operations.

Third, to make unit tests more cohesive, we support mocking all system dependencies, such as IPC calls and network calls, so that you only need to focus on the results of the local unit under test.
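For instance, a test can replace Electron's `ipcRenderer` with a Jest mock; the module `../saveDoc` and the channel name `doc:save` below are made up for illustration:

```ts
// Mock the 'electron' module so the test never reaches the real main process.
import { ipcRenderer } from 'electron';
import { saveDoc } from '../saveDoc'; // hypothetical module under test

jest.mock('electron', () => ({
  ipcRenderer: { invoke: jest.fn() },
}));

test('saveDoc forwards the payload over IPC and returns the result', async () => {
  (ipcRenderer.invoke as jest.Mock).mockResolvedValue({ ok: true });

  const result = await saveDoc({ id: 'doc-1', body: 'hello' });

  // Only the local behavior is asserted; the main process is fully mocked out.
  expect(ipcRenderer.invoke).toHaveBeenCalledWith('doc:save', { id: 'doc-1', body: 'hello' });
  expect(result).toEqual({ ok: true });
});
```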

Finally, we rely on the rich assertion capabilities provided by Jest to check the final state. Through these steps we have gradually standardized unit testing: construct the data, render the UI, then make UI assertions. This steadily lowers the whole team's unit testing cost, and everyone ends up with the same code style.
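Put together, a standardized test roughly follows the shape below. The component name is illustrative, and we assume React with @testing-library/react, which may differ from Yuque's internal tooling:

```tsx
import { render, screen } from '@testing-library/react';
import { DocTitle } from '../DocTitle';   // hypothetical component under test
import { createDoc } from './factories';  // data constructor as sketched earlier

test('renders the document title passed in via props', () => {
  // 1. Construct business data with a factory instead of hand-written fixtures.
  const doc = createDoc({ title: 'Quality practices on desktop' });

  // 2. Render the component directly in the intermediate state we care about,
  //    rather than simulating the clicks that would normally lead there.
  render(<DocTitle doc={doc} />);

  // 3. Assert on the final UI state with Jest matchers.
  expect(screen.getByText('Quality practices on desktop')).toBeTruthy();
});
```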

The next step is ensuring that unit tests keep paying off. First, we added coverage gates and test reports to the CI pipeline that runs after code is submitted, so everyone can see the unit test coverage.
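As a reference, a coverage gate of this kind can be expressed directly in the Jest configuration; the thresholds and paths below are placeholders, not Yuque's real settings (assuming a recent Jest that exports the `Config` type):

```ts
// jest.config.ts — coverage collection, CI-readable reports, and a hard gate.
import type { Config } from 'jest';

const config: Config = {
  collectCoverage: true,
  collectCoverageFrom: ['src/**/*.{ts,tsx}', '!src/**/*.d.ts'],
  // 'text-summary' prints a short table in the CI log, 'lcov' feeds HTML reports.
  coverageReporters: ['text-summary', 'lcov'],
  // CI fails when global coverage drops below these (placeholder) thresholds.
  coverageThreshold: {
    global: { statements: 60, branches: 50, functions: 60, lines: 60 },
  },
};

export default config;
```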

Second, for modules with low unit test coverage, we provide incremental coverage results for each submission, to help people find the gaps and gradually improve coverage.

Integration test coverage

Unit tests alone are far from enough to improve software quality. As we all know, high unit test coverage does not mean the integration test will be problem-free. So how can we cover as much of the code submitted this time as possible during development and integration testing?

Since we are talking about coverage, let's step back and look at how coverage is usually collected. Istanbul's approach is the de facto standard: it parses the code into an AST, marks which parts are statements and which are functions, and then instruments the code with counters. When execution finishes, the coverage results are available.

Based on this idea, we tried integrating Istanbul into the test build, instrumenting the code in the test package, so that when we run the test cases during integration testing we get the code coverage of that test run. This brings us a step closer to the incremental coverage we want.
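One way to wire this up is a small harvesting step in the test harness. This is only a sketch: it assumes the test build is instrumented (for example with babel-plugin-istanbul), which exposes its counters on `window.__coverage__`, and that the renderer can reach Node's `fs`:

```ts
// Dump the coverage counters collected in an instrumented test build to disk,
// so `nyc report` (or the Istanbul APIs) can turn them into a report later.
import * as fs from 'fs';
import * as path from 'path';

declare global {
  interface Window {
    __coverage__?: Record<string, unknown>;
  }
}

export function dumpCoverage(outDir = '.nyc_output'): void {
  const coverage = window.__coverage__;
  if (!coverage) return; // not an instrumented build, nothing to do

  fs.mkdirSync(outDir, { recursive: true });
  // One JSON file per integration-test run; files are merged at report time.
  fs.writeFileSync(
    path.join(outDir, `coverage-${Date.now()}.json`),
    JSON.stringify(coverage),
  );
}
```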

Git diff gives us the changed files and changed line numbers. Now that we have runtime coverage from the test run, we can intersect it with the changed lines from git diff to get the runtime coverage of the incremental code.
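Conceptually the incremental coverage is just an intersection of two data sets: the lines changed in this commit and the lines the instrumented run actually executed. A simplified sketch follows (real diff parsing and coverage formats need more care, and file paths must be normalized to the same form on both sides):

```ts
import { execSync } from 'child_process';

// file -> set of changed line numbers, parsed from `git diff -U0` hunk headers.
function changedLines(base = 'origin/master'): Map<string, Set<number>> {
  const diff = execSync(`git diff -U0 ${base} -- '*.ts' '*.tsx'`).toString();
  const result = new Map<string, Set<number>>();
  let current = '';
  for (const line of diff.split('\n')) {
    const file = line.match(/^\+\+\+ b\/(.+)$/);
    if (file) {
      current = file[1];
      result.set(current, new Set());
      continue;
    }
    const hunk = line.match(/^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/);
    if (hunk && current) {
      const start = Number(hunk[1]);
      const count = hunk[2] ? Number(hunk[2]) : 1;
      for (let i = 0; i < count; i++) result.get(current)!.add(start + i);
    }
  }
  return result;
}

// Given per-file runtime line hits (lineNumber -> hit count, derived from the
// Istanbul output), list the changed lines that were never executed.
export function uncoveredChangedLines(
  lineHits: Map<string, Map<number, number>>,
  base?: string,
): Map<string, number[]> {
  const missed = new Map<string, number[]>();
  for (const [file, lines] of changedLines(base)) {
    const hits = lineHits.get(file);
    const misses = [...lines].filter((l) => (hits?.get(l) ?? 0) === 0);
    if (misses.length) missed.set(file, misses);
  }
  return missed;
}
```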

Here is a simple diagram (a demo, not our actual code). The grayed-out part is historical code, the highlighted part is the incremental code committed this time, and the lines highlighted in red are incremental code that was never executed. During integration testing, such a coverage report lets us roughly evaluate which features were not exercised this round and whether we should continue testing.

Our code quality assurance basically rests on these two aspects: unit tests to guarantee coding quality, and integration test coverage to guarantee the coverage of the testing process.

Stability thinking

The first part was code quality; the second part is how we deal with online quality and stability on the desktop. I think desktop stability can be divided into four areas.

  1. Package security. Part of it is the Web security we already face, such as XSS vulnerabilities. Next, how do we make the delivered package tamper-proof, so users are guaranteed to get exactly the package we provide? Finally, the installer needs the notarization and signing required by Apple and Windows security.
  2. Data security, because we do a lot offline: how to store data locally and how to prevent offline data loss. Users must not lose what they wrote offline when they come back online; that is our bottom line.
  3. Application update capabilities, such as dynamic updates to reduce how often users must update manually, how to do grayscale releases, and how to roll back.
  4. Logs and monitoring: how to discover exceptions through logs and locate problems as quickly as possible.

Due to time constraints, today we will pick just one piece: log monitoring, and how full-link logging is done.

Full-link logging

What are our main requirements for logging on the desktop?

  1. Because operation links on the desktop are very long, involving inter-process communication and local database operations, we need to collect as many operation logs as possible and ensure full-link tracing.
  2. Security requirements are higher, to avoid leaking sensitive data.

What do we mainly do in day-to-day log collection?

  1. When a request comes from the renderer process, the first thing we do is generate a trace ID based on the timestamp and device information, to support log link tracing.
  2. When the request reaches the main process, we perform database operations, network requests, and business operations, and log them. One of the most important things here is desensitizing the logs, because they are written to disk; we never store sensitive information such as the content or titles of users' documents.
  3. Once we have these logs, we do two things. The first is log reporting: database operations, for example, are very frequent, so we compress and merge them into batch reports to reduce pressure on our servers. The second is local storage, because the user may be offline; the local store follows the server's log model, separated by type and rolled by date for easy tracing. (A sketch of this pipeline follows this list.)
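A condensed sketch of that pipeline is shown below. All names, thresholds, and the report call are illustrative placeholders, not Yuque's real implementation:

```ts
import * as crypto from 'crypto';

// Trace id derived from timestamp + device info (plus a little randomness),
// so every log entry along one operation link can be stitched together.
export function createTraceId(deviceId: string): string {
  const raw = `${Date.now()}-${deviceId}-${crypto.randomBytes(4).toString('hex')}`;
  return crypto.createHash('md5').update(raw).digest('hex');
}

interface LogEntry {
  traceId: string;
  type: 'db' | 'network' | 'business';
  action: string;       // e.g. 'doc.save' — never the document title or body
  durationMs?: number;
  error?: string;
}

const buffer: LogEntry[] = [];

export function log(entry: LogEntry): void {
  buffer.push(entry);               // entries are already desensitized by design
  if (buffer.length >= 50) flush(); // batch to reduce pressure on the server
}

export function flush(): void {
  if (buffer.length === 0) return;
  const batch = buffer.splice(0, buffer.length);
  // In the real pipeline this is an HTTP report plus a local, date-rolled file
  // as an offline fallback; a console line stands in for both in this sketch.
  console.log(`reporting ${batch.length} log entries`);
}
```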

Beyond day-to-day reporting on the client, we have an end-to-end log link. After the desktop reports, logs are first sent to the Yuque cloud service, where we record them on the server side, and then they are analyzed and aggregated with Alibaba Cloud SLS. In SLS we build online dashboards and alarms, and data that needs long-term analysis goes into ODPS storage.

What types of errors do we care about on the front end? The common ones are unhandledrejection and TypeError. Yuque also has something special: the editor is isolated from the main application in its own webview, and we pay close attention to editor crashes, so we build editor crash reporting based on React's componentDidCatch.
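The crash reporting is roughly a React error boundary of the following shape; `reportCrash` here is a hypothetical helper standing in for the batched log channel:

```tsx
import React from 'react';

interface State {
  crashed: boolean;
}

export class EditorCrashBoundary extends React.Component<
  { children?: React.ReactNode },
  State
> {
  state: State = { crashed: false };

  componentDidCatch(error: Error, info: React.ErrorInfo): void {
    // Report the crash; the payload carries no document content.
    reportCrash({ message: error.message, componentStack: info.componentStack });
    this.setState({ crashed: true });
  }

  render(): React.ReactNode {
    return this.state.crashed
      ? <div>The editor crashed, please reload.</div>
      : this.props.children;
  }
}

// Hypothetical reporter; in practice this goes through the log reporting channel.
function reportCrash(payload: { message: string; componentStack?: string | null }): void {
  console.error('editor crash', payload);
}
```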

After reporting, we can see the error messages in SLS online and do regular analysis.

There is also business-level analysis: with each release we watch core metrics, such as document load failures and document save failures.

Future exploration

That is most of what we have time for today. Going forward, we are also exploring automated UI testing to reduce the cost of regression testing the application.

Here is a video of what we are trying. First, unlike many teams, we believe desktop testing needs a runtime environment closer to the user's, so we inject test scripts into the test package to run against the real environment. Second, we mock the network to simulate editing and saving under weak or broken connections. Finally, we keep some test machines running long, scheduled test runs to gain insight into performance, such as memory leaks and CPU anomalies after prolonged use.

Conclusion

Finally, I'd like to share some of my thoughts on quality and stability from my time working on desktop applications.

  • Developers are the first owners of quality; through engineering means we can avoid many quality problems as early as possible.
  • Quality is a matter of gradual improvement and persistence. Whether it is unit testing or logging and monitoring, it is a long-term investment; you cannot get the payoff in one step.
  • There is no silver bullet for quality. Each team needs to find the quality direction that best fits its own business and team situation.

That's all for today. Thanks for watching.