Coroutine, C language implementation

Interestingly, this is not a popular science article.

What is a coroutine

As usual, this article should cover what a coroutine is rather than getting into the nitty gritty details of a broad implementation, which isn’t that complicated. But there are so many articles about what coroutines are, some are right, some are wrong, and some are made up. In a word, master such as cloud, since need not say more; Erroneous, although wrong, but is not a great evil; The rest, you just make things up, you make amazing things, you just don’t see them.

But what exactly is a coroutine? Some see coroutines as lightweight threads, others as variations on callbacks. The former is indeed biased. I can’t find anything wrong with the latter. After all, most coroutines can be overridden with callbacks, but from my own understanding, coroutines can also be thought of as context-preserving units of execution that give up execution but can’t be preempted. In the early days of the operating system, the use of coroutines to simulate concurrency was replaced by preemptable threads, which is why lightweight threads are considered biased.

Either way, callbacks or non-preemptible execution units, per se, don’t help performance much. Interestingly, if I say that coroutines make programs run more efficiently, no one will believe me; Some people would laugh at me for being stupid if I said that callbacks make the program run much more efficiently. Three and four, four and three, what’s the difference.

Duff’s device

Duff’s device refers to the C language’s ability to interweave switches and other control statements to manually unroll loops. A typical Duff’s device is shown below.

switch(i){
    case 1:
        if(j == 0){
            break;
        }
        case 2:
            if( b == 1){
                case 3:
            }
Copy the code

At its core, Duff’s device is more like syntactic sugar. Of course, this might seem useless, but this is in C

The most straightforward and simplest technique for implementing coroutines in a language. A simple coroutine pseudocode looks like this: static int fd = 0; void jobs(){ static int i = 0; switch(i){ case 0: addr = get_addr(); fd = connect(addr); // establish connection I = 1; return ; case 1: data,len = read(fd); printf("%.*s ",len,data); // Read data I = 2; return ; case 2: len = write(fd,data); // Write data I ++; return ; } } tnet_worker[0].call = jobs; tnet_worker[0].resume = jobs; select(nfds,rfds,wrds,NULL,NULL){ if(FD_ISSET(sigfd,rfds)){ tnet_worker[0].call(); add2rfds(fd); } if(FD_ISSET(fd,rfds)){ tnet_worker[0].resume(); } if(FD_ISSET(fd,wrds)){ tnet_worker[0].resume(); }}Copy the code

This is basically a prototype of a simple coroutine, and it’s not what some people might think of as a coroutine, but unfortunately, that’s what we call a coroutine.

Static is used to hold the context, and return is used to make the execution function logically non-preemptable. That covers the key elements of coroutines.

More interestingly, you can replace them with function callbacks in the form of do-connect,do-read,do-write, etc. It didn’t make us perform better or worse, but it did make our code logic clearer. At this point, we could create thousands of Tnet_workers to handle each connection, which might be like a thread, but as we already know, it’s not like a thread at all.

Duff’s Device is great, but there is a serious problem that local variables can’t be saved across functions. I could create thousands of coroutines, but they all share a stack. As you can see from the previous section, I used a static local variable to hold the context of the coroutine, which is fine for per coroutine per entries, but not for multi-coroutine single entries. Simon Tatham in www.chiark.greenend.org.uk/~sgtatham/c… A C code macro based on Duff’s device is provided

It also wraps up the problem of preserving context for a single entry. But we’re essentially using static local variables. This leads to a lot of assumptions that need to be made to use this macro, which we won’t go into here. (BTW, I really like this macro, it does make the logic of the code clearer, especially for C code).

For this reason, we need to assign each coroutine its own stack space, and we need the ability to jump from stack to stack, which is beyond the reach of traditional control statements. The former, we need to talk a little bit more about, the latter is very simple, and those of you who are familiar with C know that standard C provides

The [SIG]longjmp/ setjMP family of functions provides the ability to save context jumps across functions. This is easy, so I won’t go into it.

ucontext

Let me start with three interesting facts:

Most coroutine implementations use uContext
Ucontext is a deprecated function that can no longer be found in the latest POSXI standard.
POSIX suggests using PThreads as a possible alternative.

Ucontext by nature has its own stack space and by nature provides for jumping to and from each other. So it’s actually a little bit easier to implement, but I won’t go into it too much considering the fact that class changes are pretty much obsolete.

sigaltstack

__sigaltstack — set and/or get signal stack context__

As the name suggests, this is a function related to signals. The idea is to use a user-specified stack to execute a signal handler. A typical usage scenario can be considered in stack overflow scenarios, where the resulting SIGSEGV signal processing function is actually unable to execute because the current stack has overflowed.Copy the code

Interestingly, this also gives us an opportunity to grab a stack. That is, a coroutine with a separate running stack space can be created as follows:

Provides a new signal processing function execution stack.
Sets the execution function of a signal and implies execution on a new stack
Send the signal and wait for the signal processing execution to end
1. Use setjMP in the signal execution function to get the context of the stack as your own.
A handler that restores the signal
Restore the signal handler stack
Now we can successfully jump to our own stack by longjMP. With Duff’s device, we can achieve a relatively complete coroutine. If the signal processing is more understanding of the students, the specific implementation is relatively simple. I’m not going to do much here.

Thread

I recently learned that some languages use threads to simulate coroutines, and I found it interesting that coroutines have such magic. It reminds me of an anecdote from high school. Accompany learning to buy milk tea, to a cup of red bean milk tea, not red bean.

Afterword.

After all this talk about coroutine implementation, does coroutine really solve any program performance problems? I don’t want to talk about the fancy tricks of implementing coroutines, but by doing so, we can clearly see what a coroutine is. It’s not a panacea. From a machine perspective, it’s just a higher-level JMP instruction.

If one day you have to use so-called coroutines to reduce the cost of thread switching, you can ask yourself if you really need so many threads. Threads are never created by themselves. they are created by people.

Nowadays, most people are used to designing and developing from the point of view of human beings, but sometimes, there will be some different harvest from the point of view of machines.

C coroutine practices —– simple turn-based games

Consider A simple turn-based game where the protagonist A is fighting an NPC. Let’s assume the rules of the game are as follows:

Protagonist A attacks, and then the NPC attacks. Cycle until one of them drops below zero.
Both hero and NPC have 10 initial attack damage.
Both main characters and NPCS have dodge rate and crit rate.
If you dodge successfully, you take no damage.
Once critically struck, the damage is doubled.

An implementation of a coroutine is as follows (I’m using the PCL library here) :

#include "pcl.h "

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

struct attacker_st{

    uint64_t        damage ;
    int64_t         HP     ;
    coroutine_t     coro   ;

    uint8_t         critical_rate; // percent
    uint8_t         dodge_rate;
};
Copy the code

static struct attacker_st attacker_buda = {
    .HP                 =   100,
    .critical_rate      =   80,
    .dodge_rate         =   50,
};

static struct attacker_st attacker_duzi = {
    .HP                 =   100,
    .critical_rate      =   20,
    .dodge_rate         =   20,
};
Copy the code

uint64_t compute_damage(uint8_t critical_rate)
{
    uint32_t r = rand() % 100 + 1;
    if(critical_rate > r){
        return (20);
    }
    return (10);
}

int dodge(uint8_t dodge_rate)
{
    uint32_t r = rand() % 100 + 1;
    if(dodge_rate > r){
        return (1);
    }
    return (0);
}
Copy the code

void Attacker_BUDA(void *args) { while(attacker_buda.HP > 0){ attacker_buda.damage = compute_damage(attacker_buda.critical_rate); co_call(attacker_duzi.coro); // yeild if(attacker_duzi.HP <= 0){ // duzi is GG,break the loop break; } if(! dodge(attacker_buda.dodge_rate)){ attacker_buda.HP -= attacker_duzi.damage; } } } void Attacker_DUZI(void *args) { while(1){ if(! dodge(attacker_duzi.dodge_rate)){ attacker_duzi.HP -= attacker_buda.damage; } if(attacker_duzi.HP > 0){ attacker_duzi.damage = compute_damage(attacker_duzi.critical_rate); } co_resume(); } } int main() { uint8_t i ; srandom((long int) &i); co_thread_init(); Attacker_buda. Coro = co_create (attacker_buda, NULL, 0327, 68); Attacker_duzi. Coro = co_create (attacker_duzi, NULL, 0327, 68); co_call(attacker_buda.coro); if(attacker_buda.HP >= attacker_duzi.HP){ printf("the Winner is Buda\n "); }else { printf("Shit,Duzi Win\n "); } co_thread_cleanup(); }Copy the code

Coroutine, C language implementation