I. Foreword

This article describes how Glib designs its thread library for Linux and Windows, summarized in the following two diagrams.

Linux platforms:

Windows:

I recently wrote several articles about cross-platform application design, and some readers left messages asking about common cross-platform libraries, so there still seems to be a lot of demand in this area.

Cross-platform means that the same application source code can be compiled into executables that run on multiple platforms.

So how do you make application code platform-independent? You need a bridge layer in the middle, which takes over the annoying platform-specific code that the application does not want to deal with.

To put it simply: the dirty work related to the platform is handled by the middle layer, and we only need to care about our own business layer when we write the application.

Without this middle layer, your code would be riddled with #if … #else blocks.
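As a rough illustration of the idea (not taken from Glib), here is a minimal sketch of such a middle layer. The names port_sleep_ms() and app_pause_briefly() are hypothetical. The platform conditionals live in one small function, and the business-layer code calls only the wrapper.

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Hypothetical middle-layer function (not a Glib API):
 * sleep for `ms` milliseconds; returns 0 on success, -1 on failure. */
int
port_sleep_ms (unsigned int ms)
{
#ifdef _WIN32
  Sleep (ms);                 /* Win32: the argument is in milliseconds */
  return 0;
#else
  struct timespec ts;
  ts.tv_sec = ms / 1000;
  ts.tv_nsec = (long) (ms % 1000) * 1000000L;
  return nanosleep (&ts, NULL) == 0 ? 0 : -1;   /* POSIX */
#endif
}

/* Business-layer code: it only sees port_sleep_ms(), no #if/#else. */
void
app_pause_briefly (void)
{
  int rc = port_sleep_ms (5);
  assert (rc == 0);
  (void) rc;   /* keeps the build warning-free when NDEBUG is defined */
}
```

Every other caller in the application can use port_sleep_ms() the same way, so adding a new platform means touching one function instead of every call site.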

Glib is one such middle-layer cross-platform library, and it provides a number of commonly used packages. The thread library is just one of them. In this article, we will learn how Glib designs its cross-platform thread library.

II. Introduction to Glib

At first glance, it is easy to confuse Glib with glibc. Both are open source software released under the LGPL, but they are completely different things.

glibc is the GNU implementation of the C standard library, while Glib is the base library underneath GTK+.

So what is GTK+? It is a GUI toolkit, and GNOME is a desktop environment built on it. Glib is the unsung hero behind GTK+.

Glib can be used on multiple platforms, such as Linux, Unix, Windows, and more. Glib provides an alternative to many of the standard, commonly used C language constructs.

As C developers, we sometimes envy C++ developers for the wealth of tools available in the standard library (STL): linked lists, vectors, string manipulation…

But what about C? You have to build those wheels yourself.

On the other hand, if over the course of daily development we collect the useful wheels we have written or borrowed into our own “treasure house”, that collection is itself a sign of experience and competitiveness.

Today, there are many high-quality C libraries on GitHub: some focus on cross-platform support, some focus on a single domain (e.g. network processing, formatted text parsing).

Glib provides many other useful toolkits for cross-platform solutions, such as event loops, thread pools, synchronous queues, memory management, and more.

Since it provides so many functions, it is inevitably large. This is why many developers pass over Glib when weighing their options.

However, since Glib is so well designed, we can learn from its design ideas, which improves our skills far more than blindly typing thousands of lines of code!

III. Design of the thread library

1. Thread-related files

On Linux, threads are typically created through the POSIX (Portable Operating System Interface) thread API. For example, the thread creation function is pthread_create(…).

In Windows, there are several ways to create threads:

  1. CreateThread()

  2. _beginthread()

Since Glib is designed to solve cross-platform problems, it must present a unified interface to applications and, underneath, call the thread functions of whichever operating system it is built for.

Glib encapsulates these thread operations in platform-specific source files, as shown below:

  1. Linux: gthread.c and gthread-posix.c are compiled into the Glib library.

  2. Windows: gthread.c and gthread-win32.c are compiled into the Glib library.
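The unified-interface idea can be sketched as follows. This is a minimal illustration, not Glib code: the names port_thread_t, port_thread_start() and port_thread_join() are hypothetical, and both platform branches are kept in one file for brevity, whereas Glib separates them into different source files.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
typedef HANDLE port_thread_t;
#else
#include <pthread.h>
typedef pthread_t port_thread_t;
#endif

/* The one signature that application code has to know. */
typedef void (*port_thread_func) (void *data);

typedef struct
{
  port_thread_func func;
  void *data;
} port_start_info;

/* Entry point with the signature that the operating system expects. */
#ifdef _WIN32
static DWORD WINAPI
port_entry (LPVOID arg)
#else
static void *
port_entry (void *arg)
#endif
{
  port_start_info info = *(port_start_info *) arg;
  free (arg);
  info.func (info.data);
  return 0;
}

/* Returns 0 on success, -1 on failure. */
int
port_thread_start (port_thread_t *thread, port_thread_func func, void *data)
{
  int failed;
  port_start_info *info;

  assert (thread != NULL && func != NULL);
  info = malloc (sizeof *info);
  if (info == NULL)
    return -1;
  info->func = func;
  info->data = data;

#ifdef _WIN32
  *thread = CreateThread (NULL, 0, port_entry, info, 0, NULL);
  failed = (*thread == NULL);
#else
  failed = (pthread_create (thread, NULL, port_entry, info) != 0);
#endif
  if (failed)
    {
      free (info);
      return -1;
    }
  return 0;
}

/* Waits for the thread to finish; returns 0 on success, -1 on failure. */
int
port_thread_join (port_thread_t thread)
{
#ifdef _WIN32
  DWORD rc = WaitForSingleObject (thread, INFINITE);
  CloseHandle (thread);
  return rc == WAIT_OBJECT_0 ? 0 : -1;
#else
  return pthread_join (thread, NULL) == 0 ? 0 : -1;
#endif
}

/* Application code: identical on both platforms. */
static void
set_answer (void *data)
{
  *(int *) data = 42;
}

int
demo_port_thread (void)
{
  port_thread_t t;
  int value = 0;

  if (port_thread_start (&t, set_answer, &value) != 0)
    return -1;
  if (port_thread_join (t) != 0)
    return -1;
  return value;
}
```

The application functions at the bottom contain no platform conditionals; only the wrapper layer knows which operating system API is being called.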

For more on this cross-platform approach to file building (i.e., compilation), I suggest you take a look at this short article: 3 ways to organize cross-platform code

2. Data structure

You’ve probably heard of the formula: program = data structure + algorithm. For a C project, understanding the design of data structures is crucial to understanding how a program works, and glib is no different.

Glib designs thread libraries in two layers: platform-independent and platform-dependent.

The platform-independent data structures are as follows (some code is omitted where it does not affect understanding):

struct  _GThread
{
  GThreadFunc func;
  gpointer data;
  gboolean joinable;
};

typedef struct _GThread GThread;

struct _GRealThread
{
  GThread thread;

  gint ref_count;
  gchar *name;
};

typedef struct _GRealThread GRealThread;

The platform-related data structures are:

Linux system:

typedef struct
{
  GRealThread thread;

  pthread_t system_thread;
  gboolean  joined;
  GMutex    lock;

  void *(*proxy) (void *);
  const GThreadSchedulerSettings *scheduler_settings;
} GThreadPosix;

Windows:

typedef struct
{
  GRealThread thread;

  GThreadFunc proxy;
  HANDLE      handle;
} GThreadWin32;

If you take a closer look at the first member variable of each structure, do you see anything?

From the perspective of hierarchical relationship, the relationship of these structures is as follows:

Linux platforms:

Windows:

What does a structure mean in the memory model? It occupies a block of memory.

Each of these structures places the structure it wraps as its first member, so a pointer to the outer structure can easily be cast to a pointer to the inner one.

In the memory model above, the first part of the GRealThread structure is a GThread, so the start of the memory occupied by a GRealThread can be treated as a GThread variable.

In object-oriented C++ terms: a pointer to a base class can point to an object of a derived class.
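The first-member layout can be checked with a small, self-contained sketch. The structure names below (BaseThread, RealThread, PosixThread) are simplified stand-ins chosen for illustration, not the real Glib definitions. C guarantees that a pointer to a structure, suitably converted, points to its first member, which is what makes the casts valid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the Glib structures (illustrative names). */
typedef struct
{
  int joinable;
} BaseThread;               /* plays the role of GThread */

typedef struct
{
  BaseThread thread;        /* first member */
  int ref_count;
  const char *name;
} RealThread;               /* plays the role of GRealThread */

typedef struct
{
  RealThread thread;        /* first member */
  unsigned long system_id;
} PosixThread;              /* plays the role of GThreadPosix */

/* Code that only knows the innermost type. */
int
is_joinable (const BaseThread *t)
{
  return t->joinable;
}

/* Code that only knows the middle type. */
const char *
thread_name (const RealThread *t)
{
  return t->name;
}

int
demo_first_member_cast (void)
{
  PosixThread posix = {
    .thread = { .thread = { .joinable = 1 }, .ref_count = 1, .name = "worker" },
    .system_id = 7
  };
  RealThread *real = (RealThread *) &posix;
  BaseThread *base = (BaseThread *) &posix;

  /* The first member sits at offset 0 at every level. */
  assert (offsetof (PosixThread, thread) == 0);
  assert (offsetof (RealThread, thread) == 0);

  /* So all three pointers refer to the same address. */
  assert ((void *) real == (void *) &posix.thread);
  assert ((void *) base == (void *) &posix.thread.thread);

  return is_joinable (base) && strcmp (thread_name (real), "worker") == 0;
}
```

The cast only works in this direction because the inner structure is the first member. If any field were placed before it, the addresses would no longer coincide.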

You can see this in the code below.

3. Create a thread

(1) Function prototype

Platform-independent functions (implemented in gthread.c)

GThread *g_thread_new (const gchar *name,
              GThreadFunc  func,
              gpointer     data);

GThread *
g_thread_new_internal (const gchar *name,
                       GThreadFunc proxy,
                       GThreadFunc func,
                       gpointer data,
                       gsize stack_size,
                       const GThreadSchedulerSettings *scheduler_settings,
                       GError **error);

Platform-specific functions (implemented in gthread-posix.c or gthread-win32.c)

GRealThread *
g_system_thread_new (GThreadFunc proxy,
                     gulong stack_size,
                     const GThreadSchedulerSettings *scheduler_settings,
                     const char *name,
                     GThreadFunc func,
                     gpointer data,
                     GError **error);

(2) Linux platform function call chain

Let’s take a look at the function call relationship on the Linux platform:

If you have the source code handy, look at the func and data arguments in the g_thread_new() function.

func is the function that the user wants the new thread to execute, as passed in from the user layer. data is the argument that func receives.

When you program directly against Linux, you call the POSIX function pthread_create() and pass in the function and argument that the user wants to execute.

However, instead of passing the user-level function directly to the operating system, Glib provides two thread proxy functions and, depending on the situation, passes one of them to pthread_create():

The first thread proxy function: g_thread_proxy();

The second thread proxy function: linux_pthread_proxy();

Which proxy function is passed depends on whether the macro HAVE_SYS_SCHED_GETATTR is defined.
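The compile-time choice between two entry points can be sketched in isolation. Everything here is hypothetical and much simpler than Glib: DEMO_HAVE_EXTRA_SETUP stands in for the real macro, and the entry function is called directly instead of being handed to the operating system.

```c
#include <assert.h>
#include <stddef.h>

typedef void *(*entry_func) (void *);

typedef struct
{
  entry_func inner_proxy;   /* what the wrapper proxy forwards to */
  int setup_done;           /* set by the wrapper proxy */
  int ran;                  /* set by the inner proxy */
} start_ctx;

/* Plays the role of the platform-independent proxy. */
static void *
inner_proxy (void *data)
{
  start_ctx *ctx = data;
  ctx->ran = 1;
  return NULL;
}

/* Plays the role of the platform-specific proxy:
 * do some per-thread setup, then forward to the stored proxy. */
static void *
setup_then_forward_proxy (void *data)
{
  start_ctx *ctx = data;
  ctx->setup_done = 1;
  return ctx->inner_proxy (data);
}

/* Pick the entry point at compile time.
 * DEMO_HAVE_EXTRA_SETUP is a hypothetical macro for this sketch. */
entry_func
choose_entry (start_ctx *ctx)
{
  ctx->inner_proxy = inner_proxy;
#if defined(DEMO_HAVE_EXTRA_SETUP)
  return setup_then_forward_proxy;
#else
  return ctx->inner_proxy;
#endif
}

int
demo_choose_entry (void)
{
  start_ctx ctx = { NULL, 0, 0 };
  entry_func entry = choose_entry (&ctx);

  entry (&ctx);             /* stands in for the OS starting the thread */
  assert (ctx.ran == 1);    /* the inner proxy runs on either path */
  return ctx.setup_done;    /* 1 only if the wrapper proxy was chosen */
}
```

Either way the inner proxy ends up running; the macro only decides whether an extra setup step is placed in front of it.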

Here is the simplified code for the g_system_thread_new() function:

GRealThread *
g_system_thread_new (proxy, stack_size, scheduler_settings, name, func, data, error)
{
  GThreadPosix *thread;
  GRealThread *base_thread;

  // (Allocation of thread, with base_thread pointing at the same memory,
  //  and the pthread attribute setup are omitted here.)

  // Store the user's function and argument in the platform-independent part.
  base_thread->thread.func = func;
  base_thread->thread.data = data;

  thread->scheduler_settings = scheduler_settings;
  thread->proxy = proxy;

#if defined(HAVE_SYS_SCHED_GETATTR)
  ret = pthread_create (&thread->system_thread, &attr, linux_pthread_proxy, thread);
#else
  ret = pthread_create (&thread->system_thread, &attr, (void* (*)(void*))proxy, thread);
#endif
}

4. Thread execution

Let’s assume that the macro HAVE_SYS_SCHED_GETATTR is defined, so pthread_create() receives the linux_pthread_proxy() function.

When the new thread is scheduled to run, linux_pthread_proxy() is called:

The simplified linux_pthread_proxy() function:

static void *
linux_pthread_proxy (void *data)
{
  // data is the platform-specific GThreadPosix pointer created in g_system_thread_new().
  GThreadPosix *thread = data;

  if (thread->scheduler_settings)
    {
      // Set the thread attributes
      tid = (pid_t) syscall (SYS_gettid);
      res = syscall (SYS_sched_setattr, tid, thread->scheduler_settings->attr, flags);
    }

  // Call the thread proxy function in Glib: g_thread_proxy()
  return thread->proxy (data);
}

Three things are worth noting in this function:

  1. The data argument is the platform-specific GThreadPosix pointer created in g_system_thread_new().

  2. The middle part sets the thread attributes.

  3. The final return statement calls g_thread_proxy(), the first thread proxy function in Glib.

Here is the simplified code for g_thread_proxy():

gpointer
g_thread_proxy (gpointer data)
{
  // data is the platform-specific GThreadPosix pointer created in g_system_thread_new().
  // Cast it to the platform-independent GRealThread.
  GRealThread* thread = data;

  // Set the thread name
  g_system_thread_set_name (thread->name);

  // Call the thread function passed in by the user layer
  retval = thread->thread.func (thread->thread.data);

  return NULL;
}

There are only three things to note about this function:

  1. data: linux_pthread_proxy() passes in a pointer to a GThreadPosix, but it can be assigned directly to a GRealThread pointer because a GThreadPosix begins with a GRealThread in memory.

  2. The middle part sets the thread name.

  3. The final thread->thread.func (thread->thread.data) statement calls the function originally passed in by the user, with the user’s data argument.

At this point, user_thread_func(data), the thread function defined by the user layer, is executed.

If Glib is built without the macro HAVE_SYS_SCHED_GETATTR, pthread_create() on Linux receives g_thread_proxy(), the first thread proxy function in Glib, directly.
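The whole chain, from creation to the user function, can be reproduced in a stripped-down, POSIX-only sketch. The types and functions below (base_thread, posix_thread, mini_thread_new() and so on) are hypothetical and only modeled on the simplified Glib code above; error handling and reference counting are left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef void *(*thread_func) (void *);

/* Platform-independent part (plays the role of GRealThread). */
typedef struct
{
  thread_func func;         /* the user's thread function */
  void *data;               /* the user's argument */
  void *retval;             /* what the user's function returned */
} base_thread;

/* Platform-specific part (plays the role of GThreadPosix). */
typedef struct
{
  base_thread base;         /* must stay the first member */
  pthread_t system_thread;
  thread_func proxy;
} posix_thread;

/* Platform-independent proxy: it only knows about base_thread. */
static void *
base_proxy (void *data)
{
  base_thread *thread = data;   /* valid because base is the first member */
  thread->retval = thread->func (thread->data);
  return NULL;
}

/* Platform-specific proxy: per-thread setup would go here, then forward. */
static void *
posix_proxy (void *data)
{
  posix_thread *thread = data;
  return thread->proxy (data);
}

posix_thread *
mini_thread_new (thread_func func, void *data)
{
  posix_thread *thread = calloc (1, sizeof *thread);
  if (thread == NULL)
    return NULL;
  thread->base.func = func;
  thread->base.data = data;
  thread->proxy = base_proxy;
  /* The operating system receives the proxy, not the user's function. */
  if (pthread_create (&thread->system_thread, NULL, posix_proxy, thread) != 0)
    {
      free (thread);
      return NULL;
    }
  return thread;
}

void *
mini_thread_join (posix_thread *thread)
{
  void *retval;
  pthread_join (thread->system_thread, NULL);
  retval = thread->base.retval;
  free (thread);
  return retval;
}

/* User-layer thread function: doubles the integer it is given. */
static void *
user_thread_func (void *data)
{
  int *n = data;
  *n *= 2;
  return n;
}

int
demo_mini_thread (void)
{
  int value = 21;
  posix_thread *t = mini_thread_new (user_thread_func, &value);
  void *ret;

  if (t == NULL)
    return -1;
  ret = mini_thread_join (t);
  assert (ret == &value);
  (void) ret;
  return value;
}
```

In this sketch the call order on the new thread is posix_proxy(), then base_proxy(), then user_thread_func(), which mirrors the linux_pthread_proxy(), g_thread_proxy(), user function sequence described above.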

The thread executes the following call relationship:

5. Windows platform function call chain

Let’s look at the function call relationship when creating a thread on Windows:

On Windows, the thread proxy function for glib is g_thread_win32_proxy().

When the new thread is scheduled to execute, the function call relationship is:

IV. Summary

The key to this thread proxy design is C structure casting: a pointer to the outer structure is cast to a pointer to the structure embedded as its first member, because the two have exactly the same content at the start of their memory.

Finally, I combined these diagrams into the following two diagrams, which fully reflect the thread design idea in Glib:

Linux platforms:

Windows:







