
Preface

libco is a C/C++ coroutine library used at large scale in the WeChat backend, where it has been running stably on tens of thousands of machines since 2013. Libco was first open-sourced in 2013 as one of Tencent's six major open source projects. It lets the backend keep an agile, synchronous programming style while giving the system high concurrency. For the story behind the development of libco, see the article “WeChat Asynchronous Transformation in Practice: The Backend Solution Behind 800 Million Monthly Active Users and Ten Million Connections per Machine”.

Features supported by libco

Libco features:

  • Non-intrusive to business logic: multi-process and multi-thread services can be converted to coroutine services, improving concurrency by up to a hundredfold;
  • Support for a CGI framework, making it easy to build Web services (New);
  • Support for common third-party libraries such as gethostbyname, mysqlclient, and SSL (New);
  • Optional shared stack mode, making it easy for a single machine to handle ten million connections (New).

Libco provides a complete and concise coroutine programming interface:

  • A pthread-like interface design: simple, clear calls such as co_create and co_resume create and resume coroutines;
  • __thread-like coroutine-private variables, and the coroutine semaphore co_signal for communication between coroutines (New);
  • A lambda implementation that does not rely on language-level support and that, combined with coroutines, lets background asynchronous tasks be written and run in place (New);
  • A small, lightweight network framework based on epoll/kqueue, and a high-performance timer based on a timing wheel.
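The timing-wheel timer in the last item can be illustrated with a minimal sketch. It is not libco's timer code: the class name `TimingWheel`, its methods, and the one-slot-per-millisecond layout are assumptions chosen for illustration. The idea is that a timer is dropped into the slot for its expiry tick, so adding a timer is O(1) and advancing the clock only touches the slots that were passed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-level timing wheel: one slot per millisecond tick.
// Assumes the wheel is created with at least one slot.
class TimingWheel {
public:
    explicit TimingWheel(std::size_t slots, std::uint64_t start_ms = 0)
        : slots_(slots), now_ms_(start_ms) {}

    // Schedule timer `id` to fire `delay_ms` after the wheel's current time.
    // Returns false when the delay is zero or does not fit in one revolution.
    bool add(int id, std::uint64_t delay_ms) {
        if (delay_ms == 0 || delay_ms >= slots_.size()) return false;
        slots_[(now_ms_ + delay_ms) % slots_.size()].push_back(id);
        return true;
    }

    // Move the wheel forward to `now_ms`, one tick at a time, and return the
    // ids that expired along the way, in firing order.
    std::vector<int> advance(std::uint64_t now_ms) {
        std::vector<int> fired;
        while (now_ms_ < now_ms) {
            ++now_ms_;
            std::vector<int>& slot = slots_[now_ms_ % slots_.size()];
            fired.insert(fired.end(), slot.begin(), slot.end());
            slot.clear();
        }
        return fired;
    }

private:
    std::vector<std::vector<int>> slots_;  // slot index = expiry tick mod wheel size
    std::uint64_t now_ms_;                 // last tick that has been processed
};
```

With an 8-slot wheel, a timer added with a 3 ms delay is returned by the first `advance` call that reaches tick 3. Delays of a full revolution or more are rejected here, where a production timer would more likely clamp them or use several wheel levels.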


The background of libco

In the early days of the WeChat backend, business requirements were complex and changeable and the product iterated quickly, so most modules adopted a half-sync/half-async model. The access layer used an asynchronous model, while the business logic layer used a synchronous multi-process or multi-thread model, with a concurrency of only tens to a few hundred. As the WeChat business grew, the system became larger and larger, and every module became vulnerable to jitter in back-end services or the network.

The choice of asynchronous transformation

To improve the concurrency of the WeChat backend, the usual approach would be to convert all production services to an asynchronous model. That is a huge engineering effort: everything from the framework to the business logic code would need a complete overhaul, which is time-consuming and risky. So we started to consider coroutines. Using coroutines, however, raised the following challenges:

  • The industry had no experience applying coroutines at scale in a C/C++ environment;
  • How to control coroutine scheduling;
  • How to handle synchronous-style API calls, such as Socket and mysqlclient;
  • How to handle existing uses of global variables and thread-private variables.

In the end, we solved all of these problems with libco and achieved an asynchronous transformation that is non-intrusive to business logic. Libco was used to convert hundreds of modules in the WeChat backend to asynchronous operation, and the business logic code remained essentially unchanged in the process. Today most services in the WeChat backend use a multi-process or multi-thread coroutine model, and their concurrency has improved qualitatively. Libco has also become the cornerstone of the WeChat backend framework.

Libco framework technical architecture diagram


Libco has three layers: an interface layer, a Hook layer, and an event-driven layer.




Libco’s innovative treatment of the synchronous style API

For synchronous-style APIs, mainly synchronous network calls, libco's primary task is to eliminate the waiting and so improve the system's concurrency. A typical backend network service goes through steps such as connect, write, and read to complete one network interaction. When these APIs are called synchronously, the entire thread is suspended while it waits on the network. Although the synchronous programming style performs poorly under concurrency, its code logic is clear and easy to write, and it supports rapid iteration and agile development. To keep these benefits without modifying the existing production business logic, libco takes over the network call interfaces (Hook) and treats suspending and resuming a coroutine as the event registration and callback of asynchronous network IO. When business logic issues a synchronous network request, the libco layer registers the request as an asynchronous event, and the current coroutine yields the CPU to other coroutines. Libco automatically resumes the coroutine when the network event occurs or the wait times out. Most synchronous-style APIs have been taken over through Hook, and libco schedules each coroutine to resume at the appropriate time.
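The register, yield, and resume cycle described above can be sketched in a few dozen lines. This is an illustration under stated assumptions, not libco's code: POSIX ucontext and poll are swapped in for libco's own context switch and event loop so the sketch stays self-contained, and every name (`Routine`, `routine_create`, `routine_resume`, `hooked_read`, `run_event_loop`, `read_once`) is hypothetical. `hooked_read` must be called from inside a coroutine.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <poll.h>
#include <string>
#include <ucontext.h>
#include <unistd.h>
#include <vector>

// Hypothetical miniature coroutine; none of these names come from libco.
struct Routine {
    ucontext_t ctx;
    std::vector<char> stack;           // private run stack
    void (*fn)(Routine*) = nullptr;    // coroutine body
    int arg_fd = -1;                   // fd the demo body reads from
    int wait_fd = -1;                  // fd this coroutine is parked on, or -1
    bool done = false;
    std::string result;                // bytes received by the demo body
};

static ucontext_t g_scheduler;         // context of the thread's event loop
static Routine* g_current = nullptr;   // coroutine running right now

static void trampoline() {
    Routine* self = g_current;
    self->fn(self);
    self->done = true;                 // returning continues at uc_link
}

void routine_create(Routine* r, void (*fn)(Routine*)) {
    r->fn = fn;
    r->stack.resize(64 * 1024);
    getcontext(&r->ctx);
    r->ctx.uc_stack.ss_sp = r->stack.data();
    r->ctx.uc_stack.ss_size = r->stack.size();
    r->ctx.uc_link = &g_scheduler;
    makecontext(&r->ctx, trampoline, 0);
}

void routine_resume(Routine* r) {
    g_current = r;
    swapcontext(&g_scheduler, &r->ctx);
    g_current = nullptr;
}

// Stand-in for a hooked read(): synchronous for the caller, but it parks the
// coroutine instead of blocking the thread when no data is available yet.
ssize_t hooked_read(int fd, void* buf, size_t len) {
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK)) return n;
        Routine* self = g_current;
        self->wait_fd = fd;                      // register the event
        swapcontext(&self->ctx, &g_scheduler);   // yield the CPU
    }
}

// Event loop: poll parked fds and resume each coroutine whose fd is ready.
// Returns after timeout_ms without events so the sketch cannot hang.
void run_event_loop(std::vector<Routine*>& routines, int timeout_ms) {
    for (;;) {
        std::vector<pollfd> fds;
        std::vector<Routine*> parked;
        for (Routine* r : routines) {
            if (r->done || r->wait_fd < 0) continue;
            pollfd p;
            p.fd = r->wait_fd;
            p.events = POLLIN;
            p.revents = 0;
            fds.push_back(p);
            parked.push_back(r);
        }
        if (fds.empty() || poll(fds.data(), fds.size(), timeout_ms) <= 0) return;
        for (size_t i = 0; i < fds.size(); ++i) {
            if (fds[i].revents == 0) continue;
            parked[i]->wait_fd = -1;
            routine_resume(parked[i]);
        }
    }
}

// Demo body: reads once, written as if the call were blocking.
void read_once(Routine* self) {
    char buf[64];
    ssize_t n = hooked_read(self->arg_fd, buf, sizeof buf);
    if (n > 0) self->result.assign(buf, static_cast<size_t>(n));
}
```

A caller would create a pipe, start `read_once` on the read end with `routine_create` and `routine_resume`, write to the other end, and then call `run_event_loop`. The body of `read_once` stays in plain synchronous style, which is the property the Hook approach is meant to preserve.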

Libco supports tens of millions of coroutines


By default, libco gives each coroutine its own run stack, allocating a fixed amount of heap memory for it when the coroutine is created. If one coroutine handles each front-end connection, then for a service with a massive number of connections the concurrency limit is easily bounded by memory. For this reason, libco also provides a stackless shared stack mode, in which several coroutines can be configured to share one run stack. When switching between coroutines on the same shared stack, the contents of the current run stack must be copied into the outgoing coroutine's private memory. To reduce the number of copies, the shared stack is copied only when the switch is between different coroutines; when the occupant of the shared stack has not changed, no copy is needed.
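The copy rule can be modelled with plain buffers. The sketch below is a simplified model, not libco's implementation: it does not switch real CPU contexts, it treats the live part of the stack as the first `used` bytes of the buffer (a real stack grows downward), and the names `Co`, `SharedStack`, `occupy`, and `run_and_leave` are hypothetical. It shows only when a save and restore happens and when it is skipped.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct SharedStack;

// Hypothetical coroutine record holding only what the copy logic needs.
struct Co {
    SharedStack* stack = nullptr;  // the shared stack this coroutine runs on
    std::vector<char> saved;       // private copy of its stack bytes while swapped out
    std::size_t used = 0;          // live bytes it currently has on the shared stack
};

struct SharedStack {
    explicit SharedStack(std::size_t size) : mem(size) {}
    std::vector<char> mem;         // the one run stack shared by several coroutines
    Co* occupant = nullptr;        // coroutine whose bytes are in mem right now
    std::size_t switches = 0;      // number of times the occupant changed
};

// Make `next` the occupant of its shared stack before it runs.
void occupy(Co* next) {
    SharedStack* s = next->stack;
    if (s->occupant == next) return;   // same occupant: no copy at all
    if (Co* prev = s->occupant) {      // save the outgoing coroutine's live bytes
        prev->saved.assign(s->mem.begin(),
                           s->mem.begin() + static_cast<std::ptrdiff_t>(prev->used));
    }
    std::copy(next->saved.begin(), next->saved.end(), s->mem.begin());  // restore
    next->used = next->saved.size();
    s->occupant = next;
    ++s->switches;
}

// Simulate `c` running: take the stack, then leave `n` live bytes on it.
// Assumes n does not exceed the shared stack size.
void run_and_leave(Co* c, const char* bytes, std::size_t n) {
    occupy(c);
    std::copy(bytes, bytes + n, c->stack->mem.begin());
    c->used = n;
}
```

Running coroutine A twice in a row leaves `switches` at 1, while running B in between forces A's bytes to be saved and later restored. Private memory per coroutine is sized by the bytes actually in use rather than by a fixed full-size stack, which is where the memory saving comes from.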



 



The shared stack mode makes it easy for a single machine to handle tens of millions of connections, simply by creating enough coroutines. Using the libco shared stack mode, we created 10 million coroutines on one machine (2 × E5-2670 v3 @ 2.30GHz, 128GB of memory), with every 100,000 coroutines sharing 128K of stack memory. With an echo service running stably, total memory consumption was approximately 66GB.

Libco coroutine level private variables

When a multi-process program is converted to a multi-threaded program, global variables can be adapted quickly with __thread. For the coroutine environment we created the coroutine variable ROUTINE_VAR, which greatly simplifies the conversion to coroutines. Because coroutines are essentially executed serially within a thread, a thread-private variable can suffer reentrancy problems. For example, suppose we define a __thread thread-private variable that is meant to be exclusive to each execution flow. Once the execution environment moves to coroutines, the same thread-private variable may be manipulated by several coroutines, and they overwrite each other's values. For this reason, during the libco asynchronous transformation we changed most thread-private variables into coroutine-private variables. A coroutine-private variable is thread-private when the code runs in a multi-threaded environment without coroutines, and coroutine-private when the code runs inside a coroutine. The underlying implementation detects the runtime environment automatically and returns the correct value. Coroutine-private variables played an important role in moving the existing environment from synchronous to asynchronous, and defining one takes only a single line of declaration code.
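The dual behaviour described above (thread-private outside coroutines, coroutine-private inside them) can be sketched as follows. This is an illustrative model, not libco's macro: `Co`, `g_current_co`, and `RoutineLocalInt` are hypothetical names, the variable is limited to `int` for brevity, and the current coroutine is set by hand instead of by a real scheduler.

```cpp
#include <map>

// Hypothetical coroutine record: per-coroutine slots keyed by variable address.
struct Co {
    std::map<const void*, int> slots;
};

// The coroutine currently running on this thread, or nullptr outside coroutines.
// A real scheduler would update this on every switch.
thread_local Co* g_current_co = nullptr;

// Hypothetical int-only "routine-local" variable.
class RoutineLocalInt {
public:
    int& get() {
        if (g_current_co) {
            return g_current_co->slots[this];   // coroutine-private storage
        }
        thread_local std::map<const void*, int> per_thread;
        return per_thread[this];                // thread-private fallback
    }
};
```

A value written while coroutine A is current is invisible to coroutine B and to code running outside any coroutine, so two coroutines interleaving on one thread no longer overwrite each other's state. New slots start at zero because `std::map::operator[]` value-initializes them.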

Libco gethostbyname Hook method

Production services may need to query DNS through the gethostbyname API to obtain the real address. During the coroutine transformation, we found that our hooks for the socket family of functions did not cover gethostbyname: when a coroutine called gethostbyname, it waited synchronously for the result, which delayed the other coroutines in the same thread. It turned out that glibc's gethostbyname waits for events by calling the internal __poll function rather than the poll function we had hooked. In addition, glibc defines a thread-private variable, so switching between coroutines can corrupt data through reentrancy. In the end, gethostbyname was made asynchronous for coroutines by hooking __poll and defining coroutine-private variables. gethostbyname is the synchronous DNS query interface provided by glibc. Many good asynchronous alternatives exist, but they require a third-party library and an underlying asynchronous callback notification mechanism. Libco makes gethostbyname asynchronous through hooks, without modifying the glibc source code.

Libco defines the concept of a coroutine semaphore

In a multi-threaded environment, threads often need to synchronize; for example, one thread waits for a signal from another before it continues. This is usually handled with pthread_cond_signal. In libco, we define the coroutine semaphore co_signal to handle synchronization between coroutines. A coroutine can call co_cond_signal or co_cond_broadcast to choose between notifying one waiting coroutine and waking all of them.
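The difference between notifying one waiter and waking all of them can be shown with a small wait-queue model. It is a sketch under assumptions, not the libco API: `CoCond` and its methods are hypothetical, and each parked coroutine is represented by the callback that would resume it. The model runs that callback inline, whereas a real scheduler would more likely queue the coroutine to be resumed by its event loop.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical coroutine condition: a FIFO of parked coroutines, each
// represented by the callback that resumes it.
class CoCond {
public:
    // Park a coroutine on this condition.
    void wait(std::function<void()> resume) {
        waiters_.push_back(std::move(resume));
    }

    // Wake the longest-waiting coroutine; returns false if nobody is waiting.
    bool signal() {
        if (waiters_.empty()) return false;
        std::function<void()> resume = std::move(waiters_.front());
        waiters_.pop_front();
        resume();
        return true;
    }

    // Wake every coroutine that was waiting when the call started;
    // returns how many were woken.
    std::size_t broadcast() {
        std::deque<std::function<void()>> ready;
        ready.swap(waiters_);
        for (std::function<void()>& resume : ready) resume();
        return ready.size();
    }

private:
    std::deque<std::function<void()>> waiters_;
};
```

Swapping the queue out before running the callbacks means a coroutine that waits again during a broadcast is not woken by that same broadcast.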

Libco technology summary

Libco is an efficient C/C++ coroutine library. It provides a complete coroutine programming interface and hooks for the common Socket family of functions, so that services can be developed and iterated quickly using a synchronous programming model. After several years of stable operation, libco plays an important role as the cornerstone of the WeChat backend framework.

Source code package download


  libco-20161207snapshot(52im.net).zip (42.39KB)

Latest source code address


github.com/52im/libco