2020 was an extraordinary year. Throughout it, the ByteDance technical team kept sharing ByteDance’s engineering practices. The team published 50 articles this year, many of which were well received by readers. For New Year’s Day, we have selected 10 of our favorite articles for you to revisit. Click on a title to read the full article.

TOP 1: Summary of ByteDance’s Chaos Engineering Practice

#Infrastructure #Chaos Engineering

Chaos engineering uses fault injection to expose a system’s weak points and thereby improve its stability. With the development of microservices and cloud-native technologies, distributed systems have become widespread across the industry, but they also bring challenges: complexity rises sharply, the consequences of a failure are hard to predict, and failures are hard to prevent and to verify. Chaos engineering addresses these problems through fault injection. This article discusses ByteDance’s practices since it adopted chaos engineering, in the hope of providing a useful reference.

TOP 2: ByteDance’s Self-Developed Quadrillion-Scale Graph Database & Graph Computing Practices

#Infrastructure #Graph Database #ByteGraph

Gartner listed graph as one of the top 10 data and analytics trends of 2019. ByteDance has also made extensive use of graph technology to meet the business challenge of recommending massive amounts of content to massive numbers of users. This article takes an in-depth look at the distributed graph database and graph computing engine developed by ByteDance, and shows how these technologies solve business problems and shape the product experience of hundreds of millions of Internet users.

TOP 3: Toutiao Quality Optimization – Opening the Article Detail Page Within a Second

#Client #Performance Optimization

Toutiao is a content application, and reading news and information is its users’ core need, so the speed at which a page opens directly affects the core user experience. To carry rich content while keeping style and logic consistent across platforms, Toutiao renders the content of its detail pages in a WebView. However, WebView performance is relatively poor compared with native rendering, so the Toutiao technical team has been working continuously on the loading speed of detail pages.

After continuous optimization, the detail page in the production Toutiao app now opens with a delay that is essentially imperceptible. In the article, we break down step by step our ideas and practices for optimizing detail page loading.

TOP 4: An online incident caused by “…”

#Go #Server

This article reviews and summarizes an online incident caused by a dependency upgrade combined with abnormal data. It draws the following lessons:

  1. Before releasing a major version update of a service, run it offline for at least one week.
  2. If a problem appears, roll back immediately.
  3. Use tools in a standardized way. For example, do not change the contents of the vendor folder without updating the vendor.json file at the same time, and vice versa.
  4. Because of how go mod handles versioning, and because some third-party library authors do not follow open source conventions, users can unknowingly and passively introduce hard-to-find problems. You can use go mod vendor instead; if you want to lock a version, use replace.
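
As an illustration of the last point, the following is a minimal sketch of a go.mod file that pins a dependency with a replace directive. The module paths and version numbers are hypothetical placeholders, not taken from the original article.

```
// go.mod (hypothetical module and dependency names)
module example.com/myservice

go 1.15

require github.com/example/somelib v1.2.3

// Force every build to resolve somelib to exactly v1.2.3,
// even if another dependency asks for a newer version.
replace github.com/example/somelib => github.com/example/somelib v1.2.3
```

Running go mod vendor afterwards copies the resolved dependencies into the vendor folder, so the code the service is built against can be reviewed and committed together with the service itself.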

TOP 5: iOS Performance Optimization Practices: How Toutiao and Douyin Reduced Their OOM Crash Rate by 50%+

#Client #iOS #OOM

Locating the cause of iOS OOM crashes in production has long been a hard problem for the industry, and ByteDance products such as Toutiao and Douyin face it as well.

In the course of its work, ByteDance’s performance and stability assurance team developed an OOM attribution scheme called Online Memory Graph, which is based on memory snapshot technology and can be used in production. With this solution, the OOM crash rate of Toutiao and Douyin fell by 50%+ within 3 months.

This article covers the technical background, the underlying principles, and the usage of the solution, aiming to offer a new approach to this difficult problem.

TOP 6: ByteDance’s Practices with Its Go Network Library

#Go #Infrastructure

As an important part of the R&D system, the RPC framework carries almost all service traffic. This article briefly introduces the design and practice of Netpoll, ByteDance’s self-developed network library, along with the problems encountered in practice and their solutions, in the hope of providing a useful reference.

TOP 7: YARN Optimization and Practice at ByteDance

#YARN #Infrastructure

This article introduces ByteDance’s optimization of Hadoop YARN over the past four years and its practical experience in production, covering four aspects: improving utilization, optimizing for multiple workload types, improving stability, and multi-site active-active deployment.

TOP 8: Fastbot: Smart Monkey on the Move

#Testing #Fastbot

Fastbot is an intelligent, model-based GUI testing service. We split the service into a client and a server: the client handles GUI monitoring and action injection, while the server handles model building and algorithmic decision-making. We use reinforcement learning to develop a variety of intelligent search algorithms, which effectively avoid getting stuck in local loops during GUI testing and greatly improve test coverage.

TOP 9: Douyin Package Size Optimization – Resource Optimization

#Android #Performance Optimization #Package Size

With rapid business iteration, the package size of Douyin for Android has grown sharply. Package size directly affects download conversion rates, promotion costs, runtime memory, and installation time, so slimming down the APK is well worth the effort. An APK mainly consists of DEX files, resources, assets, native libraries, and metadata, and package size can be optimized for each part.

Among these, resources account for a large share of the APK size, so optimizing them is a very important part of package size optimization. In pursuit of the best possible result, this article describes in detail the optimization measures applied to the resource part of Douyin for Android.

TOP 10: Douyin Android Performance Optimization Series: Java Memory Optimization

#Android #Performance Optimization #Java #OOM

Memory is one of the most important resources a computer program needs in order to run, and it must be allocated and reclaimed sensibly at runtime. If memory is used poorly, the application may stall, trigger ANRs, or show a black screen, and in the worst case it crashes with an OOM (out of memory) error. As a product with a very large user base, Douyin needs to stay smooth and stable on devices with widely varying hardware resources, so memory optimization demands attention.

Based on the work done to reduce Java OOM crashes in Douyin, this article shares some of the Douyin team’s thinking on Java memory optimization, including tool building and optimization methodology.


Welcome to join the ByteDance technical team

Resume mailing address: [email protected]